
Showing papers by "My T. Thai published in 2018"


Journal ArticleDOI
TL;DR: The authors summarize existing efforts and discuss the promising future of their integration, seeking to answer the question: What can smart, decentralized, and secure systems do for our society?
Abstract: AI and blockchain are among the most disruptive technologies and will fundamentally reshape how we live, work, and interact. The authors summarize existing efforts and discuss the promising future of their integration, seeking to answer the question: What can smart, decentralized, and secure systems do for our society?

212 citations


Journal ArticleDOI
TL;DR: A hybrid ABC-TS algorithm combining artificial bee colony (ABC) and Tabu Search (TS) to solve the parallel machine scheduling problem with deteriorating maintenance activities, parallel-batching processing, and deteriorating jobs is developed.

65 citations


Proceedings ArticleDOI
29 May 2018
TL;DR: This paper highlights a new form of distributed denial of service (DDoS) attack that impacts the memory pools of cryptocurrency systems causing massive transaction backlog and higher mining fees and proposes countermeasures to contain such an attack.
Abstract: In this paper, we highlight a new form of distributed denial of service (DDoS) attack that impacts the memory pools of cryptocurrency systems causing massive transaction backlog and higher mining fees. Towards that, we study such an attack on Bitcoin mempools and explore its effects on the mempool size and transaction fees paid by the legitimate users. We also propose countermeasures to contain such an attack. Our countermeasures include fee-based and age-based designs, which optimize the mempool size and help to counter the effects of DDoS attacks. We evaluate our designs using simulations in diverse attack conditions.
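The fee-based countermeasure can be sketched as a simple admission policy: reject transactions below a relay-fee threshold and evict the cheapest ones when the pool is full. The capacity, threshold, and `Tx` fields below are illustrative assumptions, not the paper's actual design.

```python
# Sketch of a fee-based mempool admission policy in the spirit of the paper's
# countermeasures; capacity, the relay-fee threshold, and the Tx fields are
# illustrative assumptions, not the proposed design itself.
from dataclasses import dataclass, field
import heapq

@dataclass(order=True)
class Tx:
    fee_rate: float                  # fee per byte; low-fee spam is evicted first
    txid: str = field(compare=False)

class Mempool:
    def __init__(self, capacity, min_relay_fee):
        self.capacity = capacity            # max transactions kept in the pool
        self.min_relay_fee = min_relay_fee  # fee-based filter threshold
        self.heap = []                      # min-heap keyed on fee_rate

    def accept(self, tx):
        if tx.fee_rate < self.min_relay_fee:
            return False                    # reject cheap spam outright
        heapq.heappush(self.heap, tx)
        while len(self.heap) > self.capacity:
            heapq.heappop(self.heap)        # evict the lowest-fee transaction
        return True

pool = Mempool(capacity=3, min_relay_fee=1.0)
pool.accept(Tx(0.5, "spam"))                # rejected: below the relay fee
for i, f in enumerate([2.0, 5.0, 3.0, 4.0]):
    pool.accept(Tx(f, f"tx{i}"))
print(sorted(t.fee_rate for t in pool.heap))  # low-fee tx evicted: [3.0, 4.0, 5.0]
```

Under attack, raising `min_relay_fee` prices out spam, while capacity-based eviction bounds the backlog, which is the intuition behind the fee-based design.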

49 citations


Journal ArticleDOI
Alan Kuhnle1, Abdul Alim1, Xiang Li1, Huiling Zhang1, My T. Thai1 
TL;DR: In this article, the influence maximization problem on a multiplex, with each layer endowed with its own model of influence diffusion, was studied, and an algorithm, knapsack seeding of network (KSN), was proposed.
Abstract: Motivated by online social networks that are linked together through overlapping users, we study the influence maximization problem on a multiplex, with each layer endowed with its own model of influence diffusion. This problem is a novel version of the influence maximization problem that necessitates new analysis incorporating the type of propagation on each layer of the multiplex. We identify a new property, generalized deterministic submodular, which, when satisfied by the propagation in each layer, ensures that the propagation on the multiplex overall is submodular; for this case, we formulate influential seed finder (ISF), the greedy algorithm with approximation ratio $(1-1/e)$. Since the size of a multiplex comprising multiple OSNs may encompass billions of users, we formulate an algorithm, knapsack seeding of network (KSN), that runs on each layer of the multiplex in parallel. KSN takes an $\alpha$-approximation algorithm $A$ for the influence maximization problem on a single network as input, and has approximation ratio $\frac{(1-\epsilon)\alpha}{(o+1)k}$ for arbitrary $\epsilon > 0$, where $o$ is the number of overlapping users and $k$ is the number of layers in the multiplex. Experiments on real and synthesized multiplexes validate the efficacy of the proposed algorithms for the problem of influence maximization in the heterogeneous multiplex. Implementations of ISF and KSN are available at http://www.alankuhnle.com/papers/mim/mim.html .
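The $(1-1/e)$ greedy template behind ISF can be sketched on a toy coverage objective; the `reach` sets below are illustrative stand-ins for expected influence spread on a multiplex, not real data or the paper's diffusion models.

```python
# Minimal sketch of the (1 - 1/e) greedy for monotone submodular maximization,
# the template ISF instantiates; coverage stands in for influence spread here.
def greedy_seeds(reach, k):
    """Pick k seeds, each maximizing the marginal coverage gain."""
    chosen, covered = [], set()
    for _ in range(k):
        best = max((s for s in reach if s not in chosen),
                   key=lambda s: len(reach[s] - covered))
        chosen.append(best)
        covered |= reach[best]
    return chosen, covered

# hypothetical per-seed reachable sets
reach = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2}}
seeds, influenced = greedy_seeds(reach, 2)
print(seeds)  # ['a', 'b'] -- 'd' adds nothing new once 'a' is chosen
```

The $(1-1/e)$ guarantee rests on the marginal gains being monotone and submodular, which is exactly what the generalized deterministic submodular property ensures for the multiplex.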

39 citations


Posted Content
Alan Kuhnle1, Abdul Alim1, Xiang Li1, Huiling Zhang1, My T. Thai1 
TL;DR: A new property, generalized deterministic submodular, is identified which, when satisfied by the propagation in each layer, ensures that the propagation on the multiplex overall is submodular; for this case, the greedy algorithm influential seed finder (ISF) is formulated.
Abstract: Motivated by online social networks that are linked together through overlapping users, we study the influence maximization problem on a multiplex, with each layer endowed with its own model of influence diffusion. This problem is a novel version of the influence maximization problem that necessitates new analysis incorporating the type of propagation on each layer of the multiplex. We identify a new property, generalized deterministic submodular, which, when satisfied by the propagation in each layer, ensures that the propagation on the multiplex overall is submodular; for this case, we formulate ISF, the greedy algorithm with approximation ratio $(1 - 1/e)$. Since the size of a multiplex comprising multiple OSNs may encompass billions of users, we formulate an algorithm KSN that runs on each layer of the multiplex in parallel. KSN takes an $\alpha$-approximation algorithm A for the influence maximization problem on a single-layer network as input, and has approximation ratio $\frac{(1-\epsilon)\alpha}{(o+1)k}$ for arbitrary $\epsilon > 0$, where $o$ is the number of overlapping users and $k$ is the number of layers in the multiplex. Experiments on real and synthesized multiplexes validate the efficacy of the proposed algorithms for the problem of influence maximization in the heterogeneous multiplex. Implementations of ISF and KSN are available at this http URL

33 citations


Posted Content
TL;DR: In this paper, the authors propose a general framework for influence maximization on the integer lattice that generalizes prior works on this topic and demonstrate the efficiency of their algorithms in this context.
Abstract: The optimization of submodular functions on the integer lattice has received much attention recently, but the objective functions of many applications are non-submodular. We provide two approximation algorithms for maximizing a non-submodular function on the integer lattice subject to a cardinality constraint; these are the first algorithms for this purpose that have polynomial query complexity. We propose a general framework for influence maximization on the integer lattice that generalizes prior works on this topic, and we demonstrate the efficiency of our algorithms in this context.

26 citations


Journal ArticleDOI
TL;DR: It is demonstrated that the maximizing misinformation restriction problem is NP-hard even in the case where the network is a tree rooted at a single misinformation node, and it is proved that the objective function is monotone and submodular.
Abstract: Online social networks have become popular media worldwide. However, they also allow rapid dissemination of misinformation, causing negative impacts to users. With a source of misinformation, the longer the misinformation spreads, the greater the number of affected users will be. Therefore, it is necessary to prevent the spread of misinformation within a specific time period. In this paper, we propose the maximizing misinformation restriction (MMR) problem, with the purpose of finding a set of nodes whose removal from a social network maximizes the influence reduction from the source of misinformation within time and budget constraints. We demonstrate that the MMR problem is NP-hard even in the case where the network is a tree rooted at a single misinformation node, and we show that calculating the objective function is #P-hard. We also prove that the objective function is monotone and submodular. Based on that, we propose a $(1-1/\sqrt{e})$-approximation algorithm. We further design efficient heuristic algorithms, named PR-DAG, to solve MMR in very large-scale networks.

19 citations


Book ChapterDOI
18 Dec 2018
TL;DR: This paper affirms the correctness, in accuracy and efficiency, of the SSA-fix/D-SSA-fix algorithms and refutes the misclaims about 'important gaps' in the proof of D-SSA-fix's efficiency raised by Huang et al.
Abstract: SSA/D-SSA were introduced in SIGMOD'16 as the first algorithms that can provide a rigorous \(1-1/e-\epsilon\) guarantee with fewer samples than the worst-case sample complexity \(O(nk \frac{\log n}{\epsilon ^2 OPT_k})\). They are an order of magnitude faster than the existing methods. The original SIGMOD'16 paper, however, contains errors, and the fixes for SSA/D-SSA, referred to as SSA-fix and D-SSA-fix, have been published in the extended version of the paper [11]. In this paper, we affirm the correctness, in accuracy and efficiency, of the SSA-fix/D-SSA-fix algorithms. Specifically, we refute the misclaims about 'important gaps' in the proof of D-SSA-fix's efficiency raised by Huang et al. [5], published in VLDB in May 2017. We also replicate the experiments to dispute the experimental discrepancies shown in [5]. Our experimental results indicate that implementation/modification details and data pre-processing account for most discrepancies in running time. (We requested the modified code from VLDB'17 [5] last year but have not received it from the authors. We also sent them the explanation for the gaps they misclaimed in the D-SSA-fix efficiency proof but have not received their concrete feedback.)

18 citations


Book ChapterDOI
18 Dec 2018
TL;DR: With the aid of smart contracts, a system model featuring a trustless access control management mechanism to ensure that users have full control over their data and can track how data are accessed by third-party services is developed.
Abstract: In this paper, we present how blockchain can be leveraged to tackle data privacy issues in Internet of Things (IoT). With the aid of smart contracts, we have developed a system model featuring a trustless access control management mechanism to ensure that users have full control over their data and can track how data are accessed by third-party services. Additionally, we propose a firmware update scheme using blockchain that helps prevent fraudulent data caused by IoT device tampering. Finally, we discuss how our proposed solution can strengthen the data privacy as well as tolerate common adversaries.

16 citations


Proceedings ArticleDOI
02 Jul 2018
TL;DR: A method is proposed to learn a threshold model from historical data to characterize cascades in the power network, alleviating the need to calculate complicated power-network dynamics; message-passing equations are then introduced to generalize the threshold model in the power network and the percolation model in the communication network, from which an efficient solution for finding the most critical nodes in the interdependent networks is derived.
Abstract: The vulnerability of interdependent networks has recently drawn much attention, especially in key infrastructure networks such as power and communication networks. However, the existing works mainly considered a single cascade model across the networks, and there is a need for more accurate models and analysis. In this paper, we focus on interdependent power/communication networks to accurately analyze their vulnerability by considering heterogeneous cascade models. Accurately analyzing interdependent networks is challenging as the cascades are heterogeneous yet interdependent. Also, including multiple timescales into the context can further increase the complexity. To better depict the vulnerability of interdependent networks, we first propose a method to learn a threshold model from historical data to characterize the cascades in the power network and alleviate the need to calculate complicated power network dynamics. Next, we introduce message passing equations to generalize the threshold model in the power network and the percolation model in the communication network, based on which we derive an efficient solution for finding the most critical nodes in the interdependent networks. Removing the most critical nodes can cause the largest cascade and thus characterizes the vulnerability. We evaluate the performance of the proposed methods in various datasets and discuss how network parameters, such as the timescales, can impact the vulnerability.
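A threshold cascade of the kind learned for the power network can be sketched as follows: a node fails once the fraction of its failed neighbours reaches its threshold. The graph and thresholds below are made-up illustrative values, not a real power network or the paper's learned model.

```python
# Hypothetical sketch of a threshold-model cascade: node v fails when the
# fraction of failed neighbours reaches theta[v]. Values are illustrative.
def threshold_cascade(adj, theta, seeds):
    failed, changed = set(seeds), True
    while changed:                       # iterate until no new failures occur
        changed = False
        for v in adj:
            if v in failed:
                continue
            frac = sum(u in failed for u in adj[v]) / len(adj[v])
            if frac >= theta[v]:
                failed.add(v)            # cascade step: v fails too
                changed = True
    return failed

# toy line network: one initial failure propagates down the whole line
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
theta = {v: 0.5 for v in adj}
print(sorted(threshold_cascade(adj, theta, {"a"})))  # ['a', 'b', 'c', 'd']
```

Finding the seed set whose removal (or failure) maximizes this cascade size is what characterizes the vulnerability in the paper's setting.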

12 citations


Book ChapterDOI
18 Dec 2018
TL;DR: This paper investigates the Budgeted Competitive Influence Maximization problem under limited budget and time constraints, which seeks a seed set of nodes for a player or a company to propagate their products' information while their competitors simultaneously conduct a similar strategy.
Abstract: Influence Maximization (\(\mathsf {IM}\)) is one of the key problems in viral marketing and has received much attention recently. Basically, \(\mathsf {IM}\) focuses on finding a set of k seed users on a social network to maximize the expected number of influenced nodes. However, most related works consider only one player without competitors. In this paper, we investigate the Budgeted Competitive Influence Maximization (\({\mathsf {BCIM}}\)) problem under limited budget and time constraints, which seeks a seed set of nodes for a player or a company to propagate their products' information while their competitors simultaneously conduct a similar strategy. We first analyze the complexity of this problem and show that the objective function is neither submodular nor supermodular. We then apply the Sandwich framework to design \({\mathsf {SPBA}}\), a randomized algorithm that guarantees a data-dependent approximation factor.

Proceedings ArticleDOI
28 Aug 2018
TL;DR: This work proposes an efficient Combat Seed Selection algorithm to tackle general-threshold activation, in which a measure, “effectiveness”, is defined to evaluate the contribution of nodes to the fight against misinformation.
Abstract: While online social networks (OSNs) have become an important platform for information exchange, the abuse of OSNs to spread misinformation has become a significant threat to our society. To restrain the propagation of misinformation in its early stages, we study the Distance-constrained Misinformation Combat under Uncertainty problem, which aims to both reduce the spread of misinformation and enhance the spread of correct information within a given propagation distance. The problem formulation considers the competitive diffusion of misinformation and correct information. It also accounts for the uncertainty in identifying initial misinformation adopters. For competitive propagation with major-threshold activation, we propose a solution based on stochastic programming and provide an upper bound in the presence of uncertainty. We propose an efficient Combat Seed Selection algorithm to tackle general-threshold activation, in which we define a measure, "effectiveness", to evaluate the contribution of nodes to the fight against misinformation. Through extensive experiments, we validate that our algorithm outputs high-quality solutions with very fast computation.

Proceedings ArticleDOI
01 Oct 2018
TL;DR: A bicriteria greedy algorithm for MFP is presented and incorporated into a dynamic approach to caching from a library of files with evolving popularity distribution and its advantages over other static contact-pattern-aware caching and alternative dynamic approaches are demonstrated.
Abstract: Previous approaches to caching for Device-to-Device (D2D) communication cache popular files during off-peak hours. Since the popularity of content may evolve quickly or be unavailable in advance, we propose a flexible approach to cellular device caching where files are cached or uncached dynamically as file popularity evolves. Dynamic caching motivates a space-efficient optimization problem, Minimum File Placement (MFP), which is to cache a single file in the least amount of cache space to ensure a specified cache hit rate. In order to estimate the future cache hit rate, we use historical heterogeneous contact and request patterns of the devices. We present a bicriteria greedy algorithm for MFP and incorporate this algorithm into a dynamic approach to caching from a library of files with an evolving popularity distribution. In an extensive experimental evaluation, we analyze the effectiveness of our approach to mobile device caching and demonstrate its advantages over other static contact-pattern-aware caching and alternative dynamic approaches.

Proceedings ArticleDOI
16 Apr 2018
TL;DR: This work provides a novel take on the problem of adaptively uncovering network topology in an incomplete network by modeling it with a set of crawlers termed “bots” which can uncover independent portions of the network in parallel.
Abstract: In this work, we examine the problem of adaptively uncovering network topology in an incomplete network, to support more accurate decision making in various real-world applications, such as modeling for reconnaissance attacks and network probing. While this problem has been partially studied, we provide a novel take on it by modeling it with a set of crawlers termed "bots" which can uncover independent portions of the network in parallel. Accordingly, we develop three adaptive algorithms, which make decisions based on previous observations due to incomplete information, namely AGP, a sequential method; FastAGP, a parallel algorithm; and ALSP, an extension of FastAGP that uses local search to improve guarantees. These algorithms are proven to have 1/3, 1/7, and 1/(5 + ε) approximation ratios, respectively. The key analysis of these algorithms is the connection between adaptive algorithms and an intersection of multiple partition matroids. We conclude with an evaluation of these algorithms to quantify the impact of both adaptivity and parallelism. We find that in practice, adaptive approaches perform significantly better, while FastAGP performs nearly as well as AGP in most cases despite operating in a massively parallel fashion. Finally, we show that a balance between the quantity and quality of bots is ideal for maximizing observation of the network.

Proceedings ArticleDOI
03 Jul 2018
TL;DR: A novel curvature-based technique is introduced, showing that an adaptive greedy bot is approximately optimal within a factor of 1 − 1/e^{1/δ} ≈ 0.165, and it is observed that when the bot is incentivized to befriend friends-of-friends of target users it outperforms a bot that focuses on befriending targets.
Abstract: The explosive growth of online social networks (OSNs) in recent years has led to many individuals relying on them to keep up with friends and family. This, in turn, makes them prime targets for malicious actors seeking to collect sensitive, personal data. Prior work has studied the ability of socialbots, i.e., bots which pretend to be humans on OSNs, to collect personal data by befriending real users. However, this prior work has been hampered by the assumption that the likelihood of users accepting friend requests from a bot is non-increasing, a useful constraint for theoretical purposes but one contradicted by observational data. We address this limitation with a novel curvature-based technique, showing that an adaptive greedy bot is approximately optimal within a factor of 1 − 1/e^{1/δ} ≈ 0.165. This theoretical contribution is supported by simulating the infiltration of the bot on OSN topologies. Counter-intuitively, we observe that when the bot is incentivized to befriend friends-of-friends of target users it outperforms a bot that focuses on befriending targets.

Proceedings ArticleDOI
12 Jun 2018
TL;DR: One of the first algorithms for the length-bounded multicut problem is fully dynamic, capable of updating its solution upon incremental vertex / edge additions or removals from the network while maintaining its performance ratio.
Abstract: Motivated by networked systems in which the functionality of the network depends on vertices in the network being within a bounded distance T of each other, we study the length-bounded multicut problem: given a set of pairs, find a minimum-size set of edges whose removal ensures the distance between each pair exceeds T. We introduce the first algorithms for this problem capable of scaling to massive networks with billions of edges and nodes: three highly scalable algorithms with worst-case performance ratios. Furthermore, one of our algorithms is fully dynamic, capable of updating its solution upon incremental vertex/edge additions or removals from the network while maintaining its performance ratio. Finally, we show that unless NP ⊆ BPP, there is no polynomial-time approximation algorithm with performance ratio better than Ω(T), which matches the ratio of our dynamic algorithm up to a constant factor.
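The problem can be illustrated with a simple greedy heuristic (not the paper's scalable or dynamic algorithms): while some terminal pair is still within distance T, delete an edge on a shortest path between them. The toy graph below is an assumption for illustration.

```python
# Illustrative greedy for length-bounded multicut: repeatedly cut an edge on a
# shortest s-t path until every pair is more than T hops apart. This is a
# simple heuristic sketch, not the paper's approximation algorithms.
from collections import deque

def bfs_path(adj, s, t):
    """Return a shortest s-t path as a vertex list, or None if disconnected."""
    prev, q = {s: None}, deque([s])
    while q:
        u = q.popleft()
        if u == t:                       # walk parents back to recover the path
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj[u]:
            if v not in prev:
                prev[v] = u
                q.append(v)
    return None

def lb_multicut(adj, pairs, T):
    cut = []
    for s, t in pairs:
        while True:
            p = bfs_path(adj, s, t)
            if p is None or len(p) - 1 > T:
                break                    # pair already farther than T apart
            u, v = p[0], p[1]
            adj[u].remove(v)
            adj[v].remove(u)             # cut one edge on the short path
            cut.append((u, v))
    return cut

adj = {1: {2}, 2: {1, 3}, 3: {2}}        # path graph 1-2-3
print(lb_multicut(adj, [(1, 3)], T=2))   # [(1, 2)]
```

After the cut, vertices 1 and 3 are disconnected, so their distance trivially exceeds T; the hardness result in the abstract says no polynomial algorithm can beat an Ω(T) ratio for this objective.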

Proceedings Article
17 May 2018
TL;DR: This work provides two approximation algorithms for maximizing a non-submodular function on the integer lattice subject to a cardinality constraint; these are the first algorithms for this purpose that have polynomial query complexity.
Abstract: The optimization of submodular functions on the integer lattice has received much attention recently, but the objective functions of many applications are non-submodular. We provide two approximation algorithms for maximizing a non-submodular function on the integer lattice subject to a cardinality constraint; these are the first algorithms for this purpose that have polynomial query complexity. We propose a general framework for influence maximization on the integer lattice that generalizes prior works on this topic, and we demonstrate the efficiency of our algorithms in this context.
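A lattice analogue of the greedy can be sketched as spending a unit budget per step on the coordinate with the largest marginal gain. The toy objective below is not the paper's influence model, and the non-submodular ratio analysis is omitted; it only shows the coordinate-wise greedy mechanic.

```python
# Illustrative coordinate-wise greedy on the integer lattice under a total
# budget (cardinality) constraint; the objective is a toy stand-in.
import math

def lattice_greedy(f, n, budget):
    """Spend the budget one unit at a time on the best coordinate increment."""
    x = [0] * n
    for _ in range(budget):
        gains = [f(x[:i] + [x[i] + 1] + x[i + 1:]) - f(x) for i in range(n)]
        i = max(range(n), key=gains.__getitem__)
        x[i] += 1
    return x

f = lambda x: sum(math.sqrt(v) for v in x)  # toy monotone objective
print(lattice_greedy(f, 3, 6))              # diminishing returns spread it: [2, 2, 2]
```

Each greedy step makes `n` queries to `f`, so the whole run uses `n * budget` queries, illustrating the polynomial query complexity the abstract emphasizes.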

Posted Content
TL;DR: Simulation results show that the proposed pricing mechanism can increase the fraction of users that achieve their QoS requirements by up to 45% compared to classical algorithms that do not account for users' requirements.
Abstract: In this paper, a novel economic approach, based on the framework of contract theory, is proposed for providing incentives for LTE over unlicensed channels (LTE-U) in cellular networks. In this model, a mobile network operator (MNO) designs and offers a set of contracts to the users to motivate them to accept being served over the unlicensed bands. A practical model is considered in which the information about the quality-of-service (QoS) required by every user is not known to the MNO or to other users. For this contractual model, the closed-form expression of the price charged by the MNO for every user is derived, and the problem of spectrum allocation is formulated as a matching game with incomplete information. For the matching problem, a distributed algorithm is proposed to assign the users to the licensed and unlicensed spectra. Simulation results show that the proposed pricing mechanism can increase the fraction of users that achieve their QoS requirements by up to 45% compared to classical algorithms that do not account for users' requirements. Moreover, the performance of the proposed algorithm in the case of incomplete information is shown to approach the performance of the same mechanism with complete information.
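The user-to-spectrum assignment is formulated as a matching game; the classic deferred-acceptance (Gale-Shapley) routine below is the standard solver for such games with capacities. The preference lists and band capacities here are illustrative assumptions, not the paper's utility-derived preferences or its distributed algorithm.

```python
# Hedged sketch: deferred acceptance for a capacitated matching game, the
# standard template for user-to-spectrum assignment. Preferences are toy data.
def deferred_acceptance(user_prefs, band_prefs, capacity):
    match = {b: [] for b in band_prefs}
    free = list(user_prefs)              # users still proposing
    nxt = {u: 0 for u in user_prefs}     # next band each user will propose to
    while free:
        u = free.pop()
        b = user_prefs[u][nxt[u]]
        nxt[u] += 1
        match[b].append(u)
        if len(match[b]) > capacity[b]:
            # band rejects its least-preferred current proposer
            worst = max(match[b], key=band_prefs[b].index)
            match[b].remove(worst)
            free.append(worst)
    return match

users = {"u1": ["licensed", "unlicensed"], "u2": ["licensed", "unlicensed"]}
bands = {"licensed": ["u2", "u1"], "unlicensed": ["u1", "u2"]}
print(deferred_acceptance(users, bands, {"licensed": 1, "unlicensed": 1}))
# {'licensed': ['u2'], 'unlicensed': ['u1']}
```

Both users prefer the licensed band, but it holds only one slot and prefers u2, so u1 is rejected and falls back to the unlicensed band, yielding a stable assignment.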

Proceedings ArticleDOI
20 May 2018
TL;DR: A novel (2+θ)(ln|V|+1)-approximation algorithm for this problem is devised, with a 2-approximation for the special case of tree networks; it performs near-optimally on small networks and significantly outperforms heuristics in all cases.
Abstract: As the existing power grid becomes increasingly complex, the deployment of Smart Grids, which can significantly improve the stability and efficiency of power infrastructure, has seen increasing interest. However, with new technology comes new security concerns. Recent work has shown that fabricating valid but malicious messages on a Smart Grid's SCADA network can cause widespread power outages. Moreover, the large scale, complexity, and tight constraints of these networks make deploying in-line detection systems insufficient. A common approach is to instead conduct whole-network audits by temporarily duplicating and forwarding all network traffic to a server dedicated to detecting malicious content. This is usually done by taking advantage of port-mirroring to duplicate the packets received with minimal overhead. However, the operation of these audits faces a number of challenges. For instance, each router used to collect traffic demands physical set-up; thus there is a real cost to needlessly high coverage. In this work, we consider the problem of efficiently finding the minimal set of routers in the SCADA network to use for auditing traffic. This efficiency is critical for enabling timely auditing. Similar versions of this problem have seen study; however, they suffer either from a severe mismatch w.r.t. the problem domain or from serious scalability concerns. This motivates us to devise a novel (2+θ)(ln|V|+1)-approximation algorithm for this problem, with a 2-approximation for the special case of tree networks. We experimentally evaluate our solution and compare it to an optimal IP formulation, finding that it performs near-optimally on small networks and significantly outperforms heuristics in all cases.
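The flavor of the router-selection problem can be sketched with the classic (ln|V|+1) greedy set cover, a simplification of the paper's (2+θ)(ln|V|+1) algorithm; the routers and links below are hypothetical, and the sketch assumes every link is coverable by some router.

```python
# Sketch of greedy set cover for router selection: repeatedly pick the router
# that mirrors the most still-unaudited links. Routers/links are made up.
def greedy_cover(links_by_router, links):
    uncovered, chosen = set(links), []
    while uncovered:                     # assumes every link is coverable
        r = max(links_by_router,
                key=lambda r: len(links_by_router[r] & uncovered))
        chosen.append(r)
        uncovered -= links_by_router[r]
    return chosen

routers = {"r1": {"e1", "e2"}, "r2": {"e2", "e3"}, "r3": {"e3"}}
print(greedy_cover(routers, {"e1", "e2", "e3"}))  # ['r1', 'r2']
```

Greedy skips r3 because r2 already covers e3; the logarithmic factor in the paper's ratio is inherited from exactly this kind of greedy coverage argument.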

Journal ArticleDOI
TL;DR: This work proposes two new reference-based signaling network construction methods that provide optimal results and scale well to large-scale signaling networks of hundreds of components on synthetic, semi-synthetic, and real datasets.
Abstract: Signaling networks are involved in almost all major diseases, such as cancer. As a result, understanding how signaling networks function is vital for finding new treatments for many diseases. Using gene knockdown assays such as RNA interference (RNAi) technology, many genes involved in these networks can be identified. However, determining the interactions between these genes in signaling networks using only experimental techniques is very challenging, as performing extensive experiments is very expensive and sometimes even impractical. Construction of signaling networks from RNAi data using computational techniques has been proposed as an alternative way to solve this challenging problem. However, earlier approaches are either not scalable to large-scale networks, or their accuracy levels are not satisfactory. In this study, we integrate RNAi data given on a target network with multiple reference signaling networks and phylogenetic trees to construct the topology of the target signaling network. In our work, network construction is treated as finding the minimum number of edit operations on the given multiple reference networks, in which their contributions are weighted by their phylogenetic distances to the target network. The edit operations on the reference networks lead to a target network that satisfies the RNAi knockdown observations. Here, we propose two new reference-based signaling network construction methods that provide optimal results and scale well to large-scale signaling networks of hundreds of components. We compare the performance of these approaches to the state-of-the-art reference-based network construction method SiNeC on synthetic, semi-synthetic, and real datasets. Our analyses show that the proposed methods outperform the SiNeC method in terms of accuracy. Furthermore, we show that our methods function well even if evolutionarily distant reference networks are used. Application of our methods to the Apoptosis and Wnt signaling pathways recovers the known protein-protein interactions and suggests additional relevant interactions that can be tested experimentally.