Security Projects – ElysiumPro

Information Security is the practice of preventing unauthorized access, use, disclosure, disruption, modification, inspection, recording or destruction of information. We offer projects to our students on encryption techniques, steganography for secret file transfer and other security applications.

Quality Factor

  • 100% Assured Results
  • Best Project Explanation
  • Tons of References
  • Cost Optimized
  • Control Panel Access

1. SnapFiner: A Page-Aware Snapshot System for Virtual Machines
Virtual machine (VM) snapshots are an essential part of cloud infrastructures. Unfortunately, snapshot data are likely to be lost due to disk failures, in which case the associated VM fails to recover properly. To enhance data availability without compromising application performance upon rollback recovery, it is desirable to place multiple replicas of a snapshot across dispersed disks. However, because replicas are large, this induces non-trivial storage cost when managing massive snapshots in clouds. In this paper, we investigate this problem and find that the semantic gap between snapshot creation and snapshot storage is one key factor inducing high storage cost. To this end, we propose SnapFiner, a page-aware snapshot system for creating and storing snapshots efficiently. First, SnapFiner acquires a fine-grained page categorization with an in-depth page exploration from three orthogonal views, thereby discovering more pages that can be excluded from the snapshot. Second, SnapFiner varies the number of replicas for different page categories based on a page-aware replication policy, achieving low storage cost without compromising availability and performance. Third, SnapFiner handles the loss of pages either intentionally dropped upon snapshot creation or unexpectedly damaged due to disk failures, enabling proper system execution after recovery.
2. A Survey of Desktop Grid Scheduling
The paper surveys the state of the art of task scheduling in Desktop Grid computing systems. We describe the general architecture of a Desktop Grid system and the computing model adopted by the BOINC middleware. We summarize research papers to bring together and examine the optimization criteria and methods proposed by researchers so far for improving Desktop Grid task scheduling (assigning tasks to computing nodes by a server). In addition, we review related papers which address non-regular architectures, like hierarchical and peer-to-peer Desktop Grids, as well as Desktop Grids designed for solving interdependent tasks. Finally, we formulate and briefly describe several open problems in the field of Desktop Grid task scheduling.
3. A Self-Adaptive Network for HPC Clouds: Architecture, Framework, and Implementation
Clouds offer flexible and economically attractive compute and storage solutions for enterprises. However, the effectiveness of cloud computing for high-performance computing (HPC) systems remains questionable. When clouds are deployed on lossless interconnection networks, like InfiniBand (IB), challenges related to load-balancing, low-overhead virtualization, and performance isolation hinder full utilization of the underlying interconnect. Moreover, cloud data centers incorporate a highly dynamic environment, rendering the static network reconfigurations typically used in IB systems infeasible. In this paper, we present a framework for a self-adaptive network architecture for HPC clouds based on lossless interconnection networks, demonstrated by means of our implemented IB prototype. Our solution, based on a feedback control and optimization loop, enables the lossless HPC network to dynamically adapt to the varying traffic patterns, current resource availability, and workload distributions, in accordance with the service provider-defined policies. Furthermore, we present IBAdapt, a simplified rule-based language for the service providers to specify adaptation strategies used by the framework. Our self-adaptive IB network prototype is demonstrated using state-of-the-art industry software. The results obtained on a test cluster demonstrate the feasibility and effectiveness of the framework when it comes to improving Quality-of-Service compliance in HPC clouds.

4. An Efficient and Fair Multi-Resource Allocation Mechanism for Heterogeneous Servers
Efficient and fair allocation of multiple types of resources is a crucial objective in a cloud/distributed computing cluster. Users may have diverse resource needs. Furthermore, diversity in server properties/capabilities may mean that only a subset of servers may be usable by a given user. In platforms with such heterogeneity, we identify important limitations in existing multi-resource fair allocation mechanisms, notably Dominant Resource Fairness and its follow-up work. To overcome such limitations, we propose a new server-based approach; each server allocates resources by maximizing a per-server utility function. We propose a specific class of utility functions which, when appropriately parameterized, adjusts the trade-off between efficiency and fairness, and captures a variety of fairness measures (such as our recently proposed Per-Server Dominant Share Fairness). We establish conditions for the proposed mechanism to satisfy certain properties that are generally deemed desirable, e.g., envy-freeness, sharing incentive, bottleneck fairness, and Pareto optimality. To implement our resource allocation mechanism, we develop an iterative algorithm which is shown to be globally convergent. Subsequently, we show how the proposed mechanism could be implemented in a distributed fashion. Finally, we carry out extensive trace-driven simulations to show the enhanced performance of our proposed mechanism over the existing ones.
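As background for the Dominant Resource Fairness baseline named in this abstract, a user's dominant share is the largest fraction of any single resource that the user holds. The following is a minimal Python sketch of that quantity only, assuming one pooled cluster; the capacities, allocations, and the function name `dominant_share` are hypothetical, and this is not the per-server mechanism the paper proposes.

```python
def dominant_share(allocation, capacity):
    """Return (resource, share) for the resource of which the user
    holds the largest fraction of total capacity."""
    shares = {r: allocation.get(r, 0) / capacity[r] for r in capacity}
    resource = max(shares, key=shares.get)
    return resource, shares[resource]


# Hypothetical pooled cluster and two users with different needs.
capacity = {"cpu": 9, "mem_gb": 18}
alloc_a = {"cpu": 3, "mem_gb": 12}  # memory-heavy user
alloc_b = {"cpu": 6, "mem_gb": 2}   # CPU-heavy user

res_a, share_a = dominant_share(alloc_a, capacity)
res_b, share_b = dominant_share(alloc_b, capacity)
print(res_a, round(share_a, 3))
print(res_b, round(share_b, 3))
```

In this illustration both users end up with the same dominant share (two thirds) on different resources, which is the situation a dominant-share-equalizing allocation aims for.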
5. Expressive Content-Based Routing in Software-Defined Networks
With the vision of Internet of Things gaining popularity at a global level, efficient publish/subscribe middleware for communication within and across data centers is extremely desirable. In this respect, the very popular Software-Defined Networking, which enables publish/subscribe middleware to perform line-rate filtering of events directly on hardware, can prove to be very useful. While deploying content filters directly on switches of a software-defined network allows optimized paths, high throughput rates, and low end-to-end latency, it suffers from certain inherent limitations with respect to number of bits available on hardware switches to represent these filters. Such a limitation affects expressiveness of filters, resulting in unnecessary traffic in the network.
6. mSNP: A Massively Parallel Algorithm for Large-Scale SNP Detection
Single Nucleotide Polymorphism (SNP) detection is a fundamental procedure of whole genome analysis. SOAPsnp, a classic tool for detection, would take more than one week to analyze one typical human genome, which limits the efficiency of downstream analyses. In this paper, we present mSNP, an optimized version of SOAPsnp, which leverages Intel Xeon Phi coprocessors for large-scale SNP detection. First, we redesigned the essential data structures of SOAPsnp, which significantly reduces memory footprint and improves computing efficiency. Then we developed a coordinated parallel framework for higher hardware utilization of both CPU and Xeon Phi. We also tailored the data structures and operations to utilize the wide VPU of Xeon Phi to improve data throughput. Last but not least, we proposed a read-based window division strategy to improve throughput and obtain better load balance. mSNP is the first SNP detection tool empowered by Xeon Phi. We achieved a 38x single thread speedup on CPU, without any loss in precision. Moreover, mSNP successfully scaled to 4,096 nodes on Tianhe-2. Our experiments demonstrate that mSNP is efficient and scalable for large-scale human genome SNP detection.
7. Multi-objective Optimization for Virtual Machine Allocation and Replica Placement in Virtualized Hadoop
Resource management is a key factor in the performance and efficient utilization of cloud systems, and many research works have proposed efficient policies to optimize such systems. However, these policies have traditionally managed the resources individually, neglecting the complexity of cloud systems and the interrelation between their elements. To illustrate this situation, we present an approach focused on virtualized Hadoop for a simultaneous and coordinated management of virtual machines and file replicas. Specifically, we propose determining the virtual machine allocation, virtual machine template selection, and file replica placement with the objective of minimizing the power consumption, physical resource waste, and file unavailability. We implemented our solution using the non-dominated sorting genetic algorithm-II, which is a multi-objective optimization algorithm. Our approach obtained important benefits in terms of file unavailability and resource waste, with overall improvements of approximately 400% and 170% compared to three other optimization strategies. The benefits for the power consumption were smaller, with an improvement of approximately 1.9%.
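The non-dominated sorting genetic algorithm-II mentioned in this abstract ranks candidate solutions by Pareto dominance. The sketch below shows only that dominance test and the extraction of the first non-dominated front, assuming all objectives are minimized. The objective tuples (power, resource waste, file unavailability) are made-up numbers, and the helper names are hypothetical rather than taken from the paper.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def first_front(points):
    """Return the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (power, resource waste, file unavailability) values.
candidates = [
    (100.0, 0.30, 0.05),
    (120.0, 0.20, 0.04),
    (130.0, 0.35, 0.06),  # worse than the first candidate in every objective
]
front = first_front(candidates)
print(front)
```

The first two candidates trade power against waste and unavailability, so neither dominates the other and both stay on the front.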
8. Minimal Cost Server Configuration for Meeting Time-Varying Resource Demands in Cloud Centers
We consider the minimal cost server configuration for meeting resource demands over multiple time slots. Specifically, there are some heterogeneous servers. Each server is specified by a cost, certain amounts of several resources, and an active interval, i.e., the time interval during which the server is planned to work. There are different overall demands for each type of resource over different time slots. A feasible solution is a set of servers such that at any time slot, the resources provided by the selected servers are at least their corresponding demands. Notice that a selected server cannot provide resources for time slots outside its active interval. The total cost of the solution is the sum of the costs of all selected servers. The goal is to find a feasible solution with minimal total cost. To solve our problem, we present a randomized approximation algorithm called the partial rounding algorithm ($\mathcal{PRA}$), which guarantees an $O(\log(KT))$-approximation, i.e., an $\eta \log(KT)$-approximation, where $\eta$ is a positive constant. Furthermore, to minimize $\eta$ as much as possible, we propose a varied Chernoff bound and apply it in $\mathcal{PRA}$. We perform extensive experiments with random inputs and a specific application input. The results show that $\mathcal{PRA}$ with our varied Chernoff conclusion can find solutions close to the optimal one.
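The feasibility condition and cost objective described in this abstract can be sketched as a small checker. This is only an illustration of the problem definition under assumed data, not the partial rounding algorithm itself; the `Server` class, field names, and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Server:
    cost: float
    resources: tuple  # amount offered of each resource type
    start: int        # first active time slot (inclusive)
    end: int          # last active time slot (inclusive)


def is_feasible(servers, demands):
    """demands[t][k] is the demand for resource k in time slot t.
    A server only contributes during its active interval."""
    for t, demand in enumerate(demands):
        for k, needed in enumerate(demand):
            supplied = sum(s.resources[k] for s in servers if s.start <= t <= s.end)
            if supplied < needed:
                return False
    return True


def total_cost(servers):
    return sum(s.cost for s in servers)


# Two resource types, three time slots (all values are illustrative).
s1 = Server(cost=5, resources=(4, 8), start=0, end=1)
s2 = Server(cost=3, resources=(2, 4), start=1, end=2)
demands = [(4, 8), (6, 10), (2, 4)]

print(is_feasible([s1, s2], demands), total_cost([s1, s2]))
print(is_feasible([s1], demands))
```

Selecting both servers covers every slot at a total cost of 8, while the first server alone falls short in slot 1 and is inactive in slot 2.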
9. BiGNoC: Accelerating Big Data Computing with Application-Specific Photonic Network-on-Chip Architectures
In the era of big data, high performance data analytics applications are frequently executed on large-scale cluster architectures to accomplish massive data-parallel computations. Often, these applications involve iterative machine learning algorithms to extract information and make predictions from large data sets. Multicast data dissemination is one of the major performance bottlenecks for such data analytics applications in cluster computing, as terabytes of data need to be distributed frequently from a single data source to hundreds of computing nodes. To overcome this bottleneck for big data applications, we propose BiGNoC, a manycore chip platform with a novel application-specific photonic network-on-chip (PNoC) fabric. BiGNoC is designed for big data computing and exploits multicasting in photonic waveguides. For high performance data analytics applications, BiGNoC improves throughput by up to 9.9× while reducing latency by up to 88% and energy-per-bit by up to 98% over two state-of-the-art PNoC architectures as well as a broadcast-optimized electrical mesh NoC architecture, and a traditional electrical mesh NoC architecture.
10. Early Identification of Critical Blocks: Making Replicated Distributed Storage Systems Reliable against Node Failures
In large-scale replicated distributed storage systems consisting of hundreds to thousands of nodes, node failures are not rare and can cause data blocks to lose their replicas and become faulty. A simple but effective approach to prevent data loss from the node failures is to shorten the identification time of faulty blocks, which is determined by both timeouts and check intervals for node states. However, to maintain low repair network traffic, the identification time is actually relatively long and even dominates repair processes of critical blocks. In this paper, we propose a novel scheme, named RICK, to explore potential in the identification time, and thus improve data reliability of replicated distributed storage systems while maintaining a low repair cost. First, by introducing an additional replica state, critical blocks (with two or more lost replicas) have individual short timeouts while sick blocks (with only one lost replica) preserve the long timeouts. Second, by replacing the static check intervals for node states with adaptive ones, the check intervals and the identification time of critical blocks are further shortened, which improves data reliability. Meanwhile, due to the low ratio of critical blocks in all faulty blocks, the repair network traffic remains low. The results from our simulation and prototype implementation show that RICK improves data reliability by a factor of up to 14. Meanwhile, the extra repair network traffic caused by RICK is less than 1.5% of the total network traffic for data repairs.
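The distinction this abstract draws between critical blocks (two or more lost replicas) and sick blocks (exactly one lost replica) can be sketched as a simple classification with per-state timeouts. The timeout values and function names below are assumptions chosen for illustration; the abstract does not give concrete numbers, and the adaptive check intervals are not modeled here.

```python
# Hypothetical timeout values in seconds; the abstract gives no numbers.
SHORT_TIMEOUT_S = 30
LONG_TIMEOUT_S = 600


def classify_block(lost_replicas):
    """Map the number of lost replicas to a block state."""
    if lost_replicas >= 2:
        return "critical"
    if lost_replicas == 1:
        return "sick"
    return "healthy"


def identification_timeout(lost_replicas):
    """Critical blocks get a short timeout, sick blocks keep the long
    one, and healthy blocks need no repair timeout at all."""
    state = classify_block(lost_replicas)
    if state == "critical":
        return SHORT_TIMEOUT_S
    if state == "sick":
        return LONG_TIMEOUT_S
    return None


for lost in (0, 1, 2):
    print(lost, classify_block(lost), identification_timeout(lost))
```

Because only the small share of critical blocks uses the short timeout, most faulty blocks still wait for the long timeout, which is how the abstract explains the low extra repair traffic.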
11. Towards Stable Flow Scheduling in Data Centers
Soft real-time data center applications are booming and impose stringent delay requirements on internal data transfers. In this context, many recently proposed data center transport protocols share a common goal of minimizing Flow Completion Time (FCT), and the Shortest Remaining Processing Time (SRPT) scheduling algorithm has attracted widespread attention for its superior performance in average FCT. However, SRPT suffers from an instability problem: more and more flows are left uncompleted even if the traffic load is within the fabric capacity, which implies unnecessary bandwidth waste. To solve the problem, this paper proposes a backlog-aware flow scheduling algorithm (BASRPT) for both giant switch and general topologies. By taking queue backlogs as well as flow sizes into account when scheduling, BASRPT is provably stable while still maintaining good FCT performance. To overcome the huge computation overhead and enable distributed implementation, a fast and practical approximation algorithm called fast BASRPT is also developed. Extensive flow-level simulations show that fast BASRPT indeed stabilizes the queue length and obtains a higher throughput while being able to push the FCT arbitrarily close to the optimal value under feasible traffic loads.
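For readers unfamiliar with the SRPT baseline discussed in this abstract, the sketch below simulates plain SRPT on a single link that serves one unit of data per time step, always picking the flow with the least remaining data. It does not implement BASRPT or its backlog term; the single-link model, integer flow sizes, and the function name are simplifying assumptions.

```python
import heapq


def srpt_completion_times(flows):
    """flows: list of (arrival_time, size) with positive integer sizes.
    Returns {flow_index: completion_time} under preemptive SRPT on a
    single link that serves one unit per time step."""
    order = sorted(range(len(flows)), key=lambda i: flows[i][0])
    heap = []   # entries are (remaining_size, flow_index)
    done = {}
    t = 0
    nxt = 0     # position in `order` of the next flow to arrive
    while nxt < len(order) or heap:
        # Admit every flow that has arrived by time t.
        while nxt < len(order) and flows[order[nxt]][0] <= t:
            i = order[nxt]
            heapq.heappush(heap, (flows[i][1], i))
            nxt += 1
        if not heap:
            # Link is idle: jump to the next arrival.
            t = flows[order[nxt]][0]
            continue
        remaining, i = heapq.heappop(heap)
        remaining -= 1
        t += 1
        if remaining == 0:
            done[i] = t
        else:
            heapq.heappush(heap, (remaining, i))
    return done


# Hypothetical flows: a long flow arrives first, then two short ones.
flows = [(0, 5), (1, 2), (2, 1)]
print(srpt_completion_times(flows))  # flow 0 finishes at 8, flow 1 at 3, flow 2 at 4
```

The long flow is repeatedly preempted by shorter arrivals, which keeps average completion time low but hints at why long flows can be left waiting under sustained load.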
12. Dynamic Resource Scheduling in Mobile Edge Cloud with Cloud Radio Access Network
By integrating the cloud radio access network (C-RAN) with mobile edge cloud computing (MEC) technology, mobile service providers (MSPs) can efficiently handle the increasing mobile traffic. Previous work often studied the power consumption in C-RAN and MEC separately, while less work has considered the integration of C-RAN with MEC. In this paper, we present a unifying framework for the power-performance tradeoff of MSPs that schedules resources jointly to maximize the MSP's profit. To achieve this objective, we formulate the resource scheduling issue as a stochastic problem and design a new optimization framework by using an extended Lyapunov technique. Based on the optimization framework, we design the VariedLen algorithm to make online decisions in consecutive time for job requests with variable lengths. Our proposed algorithm can reach a time-average profit that is close to the optimum with a diminishing gap (1/V) for the MSP. With extensive simulations based on a real-world trace, we demonstrate that the profit of the VariedLen algorithm is 1.5X (1.9X) higher than that of the active (random) algorithm.
13. Improving Restore Performance in Deduplication-based Backup Systems via a Fine-Grained Defragmentation Approach
In deduplication-based backup systems, the removal of redundant data transforms the otherwise logically adjacent data chunks into physically scattered chunks on the disks. This, in effect, changes the retrieval operations from sequential to random and significantly degrades the performance of restoring data. These scattered chunks are called fragmented data and many techniques have been proposed to identify and sequentially rewrite such fragmented data to new address areas, trading off the increased storage space for reduced number of random reads (disk seeks) to improve the restore performance. However, existing solutions for backup workloads share a common assumption that every read operation involves a large fixed-size window of contiguous chunks, which restricts the fragment identification to a fixed-size read window. This can lead to inaccurate identifications due to false positives since the data fragments can vary in size and appear in any different and unpredictable address locations.
14. A Novel Data-Partitioning Algorithm for Performance Optimization of Data-Parallel Applications on Heterogeneous HPC Platforms
Modern HPC platforms have become highly heterogeneous owing to tight integration of multicore CPUs and accelerators (such as Graphics Processing Units, Intel Xeon Phis, or Field-Programmable Gate Arrays) empowering them to maximize the dominant objectives of performance and energy efficiency. Due to this inherent characteristic, processing elements contend for shared on-chip resources such as Last Level Cache (LLC), interconnect, etc. and shared nodal resources such as DRAM, PCI-E links, etc. This has resulted in severe resource contention and Non-Uniform Memory Access (NUMA) that have posed serious challenges to model and algorithm developers. Moreover, the accelerators feature limited main memory compared to the multicore CPU host and are connected to it via limited bandwidth PCI-E links thereby requiring support for efficient out-of-card execution.
15. A Survey on Recent OS-level Energy Management Techniques for Mobile Processing Units
To improve mobile experience of users, recent mobile devices have adopted powerful processing units (CPUs and GPUs). Unfortunately, the processing units often consume a considerable amount of energy, which in turn shortens battery life of mobile devices. For energy reduction of the processing units, mobile devices adopt energy management techniques based on software, especially OS (Operating Systems), as well as hardware. In this survey paper, we summarize recent OS-level energy management techniques for mobile processing units. We categorize the energy management techniques into three parts, according to main operations of the summarized techniques: 1) techniques adjusting power states of processing units, 2) techniques exploiting other computing resources, and 3) techniques considering interactions between displays and processing units. We believe this comprehensive survey paper will be a useful guideline for understanding recent OS-level energy management techniques and developing more advanced OS-level techniques for energy-efficient mobile processing units.
16. Online Auction for IaaS Clouds: towards Elastic User Demands and Weighted Heterogeneous VMs
Auctions have been adopted by many major cloud providers, such as Amazon EC2. Unfortunately, only simple auctions have been implemented. Such simple auctions have serious limitations, such as being unable to accept elastic user demands and having to allocate different types of VMs independently. These limitations create a big gap between the real needs of cloud users and the available services of cloud providers. In response to the limitations of the existing auction mechanisms, this paper proposes a novel online auction mechanism for IaaS clouds, with the unique features of an elastic model for inputting time-varying user demands and a unified model for requesting heterogeneous VMs together. However, several major challenges must be addressed, such as the NP-hardness of optimal VM allocation, time-varying user demands, and potential misreports of private information by cloud users. We propose a truthful online auction mechanism for maximizing the profit of the cloud provider in IaaS clouds, which is composed of a price-based allocation rule and a payment rule. In the allocation rule, the online auction mechanism determines the number of VMs of each type allocated to each user. In the payment rule, by introducing a marginal price function for each type of VM, the mechanism determines how much the cloud provider should charge each cloud user. We demonstrate that our mechanism is truthful, fair and individually rational, and has polynomial-time complexity. In addition, our auction achieves a competitive ratio for the profit of the cloud provider, compared against the offline optimal one.
17. Firework: Data Processing and Sharing for Hybrid Cloud-Edge Analytics
We are entering the era of the Internet of Everything (IoE), in which billions of sensors and actuators are connected to the network. As one of the most sophisticated IoE applications, real-time video analytics promises to significantly improve public safety, business intelligence, and healthcare and life science, among others. However, cloud-centric video analytics requires that all video data be preloaded to a centralized cluster or cloud, which suffers from high response latency and a formidable data transmission cost, given the zettabytes of video data generated by IoE devices. Furthermore, there is no efficient programming interface for developers and end users to easily program and deploy IoE applications across geographically distributed computation resources. In this paper, we present a new computing paradigm, Firework, that facilitates distributed data processing and sharing for IoE analytics via a virtual shared data view and service composition. An easy-to-use programming interface powered by Firework is provided for developers and end users. This paper describes its system design, implementation, and programming interface. The experimental results of an edge video analytics application demonstrate that Firework reduces response latency by up to 19.52% and network bandwidth cost by at least 72.77%, compared to a cloud-centric solution.
18. Performance Model of MapReduce Iterative Applications for Hybrid Cloud Bursting
Hybrid cloud bursting (i.e., leasing temporary off-premise cloud resources to boost the overall capacity during peak utilization) can be a cost-effective way to deal with the increasing complexity of big data analytics, especially for iterative applications. However, the low throughput, high latency network link between the on-premise and off-premise resources ("weak link") makes it difficult to maintain scalability. While there are several data locality techniques dedicated to big data bursting on hybrid clouds, their effectiveness is difficult to estimate in advance. On the other hand, such estimations are critical for users, because they aid in the decision of whether the extra pay-as-you-go cost incurred by using the off-premise resources justifies the runtime speed-up. To this end, the current paper contributes a performance model and methodology to estimate the runtime of iterative MapReduce applications in a hybrid cloud bursting scenario. A key idea of the proposal is to focus on the overhead incurred by the weak link at fine granularity, both for the map and reduce phase. This enables high estimation accuracy, as demonstrated by extensive experiments at scale using a mix of real-life iterative MapReduce applications from standard big data benchmarking suites that cover a broad spectrum of data patterns. Not only are the produced estimations accurate in absolute terms compared with the actual experimental results, but they are also up to an order of magnitude more accurate than applying state-of-the-art estimation approaches originally designed for single-site MapReduce deployments.
19. Learning-Based Memory Allocation Optimization for Delay-Sensitive Big Data Processing
Optimal resource provisioning is essential for scalable big data analytics. However, it has been difficult to accurately forecast the resource requirements before the actual deployment of these applications as their resource requirements are heavily application and data dependent. This paper identifies the existence of effective memory resource requirements for most of the big data analytic applications running inside JVMs in distributed Spark environments. Provisioning memory less than the effective memory requirement may result in rapid deterioration of the application execution in terms of its total execution time. A machine learning-based prediction model is proposed in this paper to forecast the effective memory requirement of an application given its service level agreement. This model captures the memory consumption behavior of big data applications and the dynamics of memory utilization in a distributed cluster environment. With an accurate prediction of the effective memory requirement, it is shown that up to 60 percent savings of the memory resource is feasible if an execution time penalty of 10 percent is acceptable. The accuracy of the model is evaluated on a physical Spark cluster with 128 cores and 1TB of total memory. The experiment results show that the proposed solution can predict the minimum required memory size for given acceptable delays with high accuracy, even if the behavior of target applications is unknown during the training of the model.
20. Power-Aware and Performance-Guaranteed Virtual Machine Placement in the Cloud
Cloud service providers offer virtual machines (VMs) as services to users over the Internet. As VMs run on physical machines (PMs), PM power consumption needs to be considered. Meanwhile, VMs running on the same PM share physical resources, and the resulting resource contention degrades VM performance. Therefore, how to place VMs to reduce PM power consumption and guarantee VM performance is still a major challenge. However, existing VM placement (VMP) approaches did not study VM performance degradation, so they could not guarantee VM performance. To solve the high power consumption and VM performance degradation problems, this paper explores the balance between saving PM power and guaranteeing VM performance, and proposes a power-aware and performance-guaranteed VMP (PPVMP). First, we investigate the relationship between power consumption and CPU utilization to build a non-linear power model, which is helpful for the following VMP. Second, we construct VM performance models to present the VM performance degradation trend. Third, based on these models, we formulate VMP as a bi-objective optimization problem, which tries to minimize PM power consumption and guarantee VM performance. We then propose an algorithm based on ant colony optimization to solve it. Finally, the results show the efficiency of our algorithm.
21. A Differentiated Caching Mechanism to Enable Primary Storage Deduplication in Clouds
Existing primary deduplication techniques either use inline caching to exploit locality in primary workloads or use post-processing deduplication to avoid the negative impact on I/O performance. However, neither of them works well in the cloud servers running multiple services for the following two reasons: First, the temporal locality of duplicate data writes varies among primary storage workloads, which makes it challenging to efficiently allocate the inline cache space and achieve a good deduplication ratio. Second, the post-processing deduplication does not eliminate duplicate I/O operations that write to the same logical block address as it is performed after duplicate blocks have been written. A hybrid deduplication mechanism is promising to deal with these problems. Inline fingerprint caching is essential to achieving efficient hybrid deduplication. In this paper, we present a detailed analysis of the limitations of using existing caching algorithms in primary deduplication in the cloud. We reveal that existing caching algorithms either perform poorly or incur significant memory overhead in fingerprint cache management. To address this, we propose a novel fingerprint caching mechanism that estimates the temporal locality of duplicates in different data streams and prioritizes the cache allocation based on the estimation. We integrate the caching mechanism and build a hybrid deduplication system. Our experimental results show that the proposed mechanism provides significant improvement for both deduplication ratio and overhead reduction.
22. High-Speed Transfer Optimization Based on Historical Analysis and Real-Time Tuning
Data-intensive scientific and commercial applications increasingly require frequent movement of large datasets from one site to the other(s). Despite growing network capacities, these data movements rarely achieve the promised data transfer rates of the underlying physical network due to poorly tuned data transfer protocols. Accurately and efficiently tuning the data transfer protocol parameters in a dynamically changing network environment is a major challenge and remains an open research problem. In this paper, we present a novel dynamic parameter tuning algorithm based on historical data analysis and real-time background traffic probing, dubbed HARP. Most of the previous work in this area is based solely on real-time network probing or static parameter tuning, which either results in an excessive sampling overhead or fails to accurately predict the optimal transfer parameters. Combining historical data analysis with real-time sampling lets HARP tune the application-layer data transfer parameters accurately and efficiently to achieve close-to-optimal end-to-end data transfer throughput with very low overhead. Instead of one-time parameter estimation, HARP uses a feedback loop to adjust the parameter values to changing network conditions in real time. Our experimental analyses over a variety of network settings show that HARP outperforms existing solutions by up to 50 percent in terms of the achieved data transfer throughput.
23. Storage, Communication, and Load Balancing Trade-off in Distributed Cache Networks
We consider load balancing in a network of caching servers delivering content to end users. Randomized load balancing via the so-called power of two choices is a well-known approach in parallel and distributed systems. In this framework, we investigate the tension between storage resources, communication cost, and load balancing performance. To this end, we propose a randomized load balancing scheme which simultaneously considers cache size limitation and proximity in the server redirection process. In contrast to the classical power of two choices setup, since the memory limitation and the proximity constraint cause correlation in the server selection process, we may not benefit from the power of two choices. However, we prove that in certain regimes of problem parameters, our scheme results in a maximum load of order $\Theta(\log \log n)$ (here $n$ is the network size). This is an exponential improvement compared to the scheme which assigns each request to the nearest available replica. Interestingly, the extra communication cost incurred by our proposed scheme, compared to the nearest replica strategy, is small. Furthermore, our extensive simulations show that the trade-off trend does not depend on the network topology and library popularity profile details.
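The classical power of two choices that this abstract builds on can be illustrated with a small simulation: each request samples a few servers uniformly at random and goes to the least loaded one. The sketch below covers only that classical setup, with no cache-size or proximity constraints, so it is not the scheme the paper proposes; the function name, request count, and seed are arbitrary assumptions.

```python
import random


def simulate_loads(n_servers, n_requests, choices, seed=0):
    """Assign each request to the least loaded of `choices` servers
    sampled uniformly at random; return the final per-server loads."""
    rng = random.Random(seed)
    loads = [0] * n_servers
    for _ in range(n_requests):
        candidates = [rng.randrange(n_servers) for _ in range(choices)]
        target = min(candidates, key=lambda s: loads[s])
        loads[target] += 1
    return loads


n = 10_000
one_choice = simulate_loads(n, n, choices=1)
two_choices = simulate_loads(n, n, choices=2)
print("max load with one choice: ", max(one_choice))
print("max load with two choices:", max(two_choices))
```

With as many requests as servers, the two-choice run typically shows a noticeably smaller maximum load than the single-choice run, which is the effect the abstract refers to.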
24MeLoDy: A Long-Term Dynamic Quality-Aware Incentive Mechanism for Crowdsourcing
Crowdsourcing allows requesters to allocate tasks to a group of workers on the Internet to make use of their collective intelligence. Quality control is a key design objective in incentive mechanisms for crowdsourcing as requesters aim at obtaining high-quality answers under a limited budget. However, when measuring workers' long-term quality, existing mechanisms either fail to utilize workers' historical information, or treat workers' quality as stable and ignore its temporal characteristics, hence performing poorly in the long run. In this paper we propose MELODY, a long-term dynamic quality-aware incentive mechanism for crowdsourcing. MELODY models interaction between requesters and workers as reverse auctions that run continuously. In each run of MELODY, we design a truthful, individually rational, budget-feasible and quality-aware algorithm for task allocation with polynomial-time computation complexity and O(1) performance ratio. Moreover, taking into consideration the long-term characteristics of workers' quality, we propose a novel framework in MELODY for quality inference and parameter learning based on Linear Dynamical Systems at the end of each run, which takes full advantage of workers' historical information and predicts their quality accurately. Through extensive simulations, we demonstrate that MELODY outperforms existing work in terms of both quality estimation (reducing estimation error by 17.6% ~ 24.2%) and social performance (increasing requester's utility by 18.2% ~ 46.6%) in long-term scenarios.
25Automatic Detection of Large Extended Data-Race-Free Regions with Conflict Isolation
Data-race-free (DRF) parallel programming becomes a standard as newly adopted memory models of mainstream programming languages such as C++ or Java impose data-race-freedom as a requirement. We propose compiler techniques that automatically delineate extended data-race-free (xDRF) regions, namely regions of code that provide the same guarantees as the synchronization-free regions (in the context of DRF codes). xDRF regions stretch across synchronization boundaries, function calls and loop back-edges and preserve the data-race-free semantics, thus increasing the optimization opportunities exposed to the compiler and to the underlying architecture. We further enlarge xDRF regions with a conflict isolation (CI) technique, delineating what we call xDRF-CI regions while preserving the same properties as xDRF regions. Our compiler (1) precisely analyzes the threads' memory accessing behavior and data sharing in shared-memory, general-purpose parallel applications, (2) isolates data-sharing and (3) marks the limits of xDRF-CI code regions. The contribution of this work consists in a simple but effective method to alleviate the drawbacks of the compiler's conservative nature in order to be competitive with (and even surpass) an expert in delineating xDRF regions manually. We evaluate the potential of our technique by employing xDRF and xDRF-CI region classification in a state-of-the-art, dual-mode cache coherence protocol.
26A Relaxation-Based Network Decomposition Algorithm for Parallel Transient Stability Simulation with Improved Convergence
Transient stability simulation of a large-scale and interconnected electric power system involves solving a large set of differential algebraic equations (DAEs) at every simulation time-step. With the ever-growing size and complexity of power grids, dynamic simulation becomes more time-consuming and computationally difficult using conventional sequential simulation techniques. To cope with this challenge, this paper aims to develop a fully distributed approach intended for implementation on High Performance Computer (HPC) clusters. A novel, relaxation-based domain decomposition algorithm known as Parallel-General-Norton with Multiple-port Equivalent (PGNME) is proposed as the core technique of a two-stage decomposition approach to divide the overall dynamic simulation problem into a set of subproblems that can be solved concurrently to exploit parallelism and scalability. While the convergence property has traditionally been a concern for relaxation-based decomposition, an estimation mechanism based on multiple-port network equivalent is adopted as the preconditioner to enhance the convergence of the proposed algorithm. The proposed algorithm is illustrated using rigorous mathematics and validated both in terms of speed-up and capability. Moreover, a complexity analysis is performed to support the observation that PGNME scales well when the subproblems are sufficiently large.
27A Novel Network Structure with Power Efficiency and High Availability for Data Centers
Designing a cost-effective network for data centers that can deliver sufficient bandwidth and provide high availability has drawn tremendous attention recently. In this paper, we propose a novel server-centric network structure called RCube, which is energy efficient and can deploy a redundancy scheme to improve the availability of data centers. Moreover, RCube shares many good properties with BCube, a well-known server-centric network structure, yet its network size can be adjusted more conveniently. We also present a routing algorithm to find paths in RCube and an algorithm to find multiple parallel paths between any pair of source and destination servers. In addition, we theoretically analyze the power efficiency of the network and availability of RCube under server failure. Our comprehensive simulations demonstrate that RCube provides higher availability and more flexibility to trade off among many factors, such as power consumption and aggregate throughput, than BCube, while delivering similar performance to BCube in many critical metrics, such as average path length, path distribution and graceful degradation, which makes RCube a very promising empirical structure for an enterprise data center network product.
28MIA: Metric Importance Analysis for Big Data Workload Characterization
Data analytics is at the foundation of both high-quality products and services in modern economies and societies. Big data workloads run on complex large-scale computing clusters, which implies significant challenges for deeply understanding and characterizing overall system performance. In general, performance is affected by many factors at multiple layers in the system stack, hence it is challenging to identify the key metrics when understanding big data workload performance. In this paper, we propose a novel workload characterization methodology using ensemble learning, called Metric Importance Analysis (MIA), to quantify the respective importance of workload metrics. By focusing on the most important metrics, MIA reduces the complexity of the analysis without losing information. Moreover, we develop the MIA-based Kiviat Plot (MKP) and Benchmark Similarity Matrix (BSM) which provide more insightful information than the traditional linkage clustering based dendrogram to visualize program behavior (dis)similarity. To demonstrate the applicability of MIA, we use it to characterize three big data benchmark suites: HiBench, CloudRank-D and SZTS. The results show that MIA is able to characterize complex big data workloads in a simple, intuitive manner, and reveal interesting insights. Moreover, through a case study, we demonstrate that tuning the configuration parameters related to the important metrics found by MIA results in higher performance improvements than through tuning the parameters related to the less important ones.
29Energy Efficiency Aware Task Assignment with DVFS in Heterogeneous Hadoop Clusters
To address the computing challenge of "big data", a number of data-intensive computing frameworks (e.g., MapReduce, Dryad, Storm and Spark) have emerged and become popular. YARN is a de facto resource management platform that enables these frameworks to run together in a shared system. However, we observe that, in a cloud computing environment, the fair resource allocation policy implemented in YARN is not suitable because its memoryless resource allocation violates a number of good properties in shared computing systems. This paper attempts to address these problems for YARN. Both single-level and hierarchical resource allocations are considered. For single-level resource allocation, we propose a novel fair resource allocation mechanism called Long-Term Resource Fairness (LTRF) for such computing. For hierarchical resource allocation, we propose Hierarchical Long-Term Resource Fairness (H-LTRF) by extending LTRF. We show that both LTRF and H-LTRF can address these fairness problems of the current resource allocation policy and are thus suitable for cloud computing. Finally, we have developed LTYARN by implementing LTRF and H-LTRF in YARN, and our experiments show that it leads to better resource fairness than the existing fair schedulers of YARN.
30Data Security and Privacy-Preserving in Edge Computing Paradigm: Survey and Open Issues
With the explosive growth of Internet of Things devices and massive data produced at the edge of the network, the traditional centralized cloud computing model has come to a bottleneck due to bandwidth limitations and resource constraints. Therefore, edge computing, which enables storing and processing data at the edge of the network, has emerged as a promising technology in recent years. However, the unique features of edge computing, such as content perception, real-time computing, and parallel processing, have also introduced several new challenges in the field of data security and privacy-preserving, which are also the key concerns of the other prevailing computing paradigms, such as cloud computing, mobile cloud computing, and fog computing. Despite its importance, a survey of recent research advances in data security and privacy-preserving in the field of edge computing is still lacking. In this paper, we present a comprehensive analysis of the data security and privacy threats, protection technologies, and countermeasures inherent in edge computing. Specifically, we first give an overview of edge computing, including forming factors, definition, architecture, and several essential applications. Next, a detailed analysis of data security and privacy requirements, challenges, and mechanisms in edge computing is presented. Then, the cryptography-based technologies for solving data security and privacy issues are summarized. The state-of-the-art data security and privacy solutions in edge-related paradigms are also surveyed.
31Capacity Optimization for Resource Pooling in Virtualized Data Centers with Composable Systems
Recent research trends exhibit a growing imbalance between the demands of tenants' software applications and the provisioning of hardware resources. Misalignment of demand and supply gradually hinders workloads from being efficiently mapped to fixed-sized server nodes in traditional data centers. The incurred resource holes not only lower infrastructure utilization but also cripple the capability of a data center for hosting large-sized workloads. This deficiency motivates the development of a new rack-wide architecture referred to as the composable system. The composable system transforms traditional server racks of static capacity into a dynamic compute platform. Specifically, this novel architecture aims to link up all compute components that are traditionally distributed on traditional server boards, such as central processing unit (CPU), random access memory (RAM), storage devices, and other application-specific processors. By doing so, a logically giant compute platform is created and this platform is more resistant against the variety of workload demands by breaking the resource boundaries among traditional server boards. In this paper, we introduce the concepts of this reconfigurable architecture and design a framework of the composable system for cloud data centers.
32Confluence: Speeding Up Iterative Distributed Operations by Key-Dependency-Aware Partitioning
A typical shuffle operation randomly partitions data on many computers, generating possibly a significant amount of network traffic which often dominates a job's completion time. This traffic is particularly pronounced in iterative distributed operations where each iteration invokes a shuffle operation. We observe that data of different iterations are related according to the transformation logic of distributed operations. If data generated by the current iteration are partitioned to the computers where they will be processed in the next iteration, unnecessary shuffle network traffic between the two iterations can be prevented. We model general iterative distributed operations as the transform-and-shuffle primitive and define a powerful notion named Confluence key dependency to precisely capture the data relations in the primitive. We further find that by binding key partitions between different iterations based on the Confluence key dependency, the shuffle network traffic can always be reduced by a predictable percentage. We implemented the Confluence system. Confluence provides a simple interface for programmers to express the Confluence key dependency, based on which Confluence automatically generates efficient key partitioning schemes. Evaluation results on diverse real-life applications show that Confluence greatly reduces the shuffle network traffic, resulting in as much as 23 percent job completion time reduction.
33Toward High Mobile GPU Performance through Collaborative Workload Offloading
The ever-increasing display resolution of mobile devices places high demand on GPU rendering detail. However, the combination of poor hardware support and fine-grained rendering detail often leaves users unsatisfied, especially in scenarios that call for high frame rates, e.g., games. To resolve this issue, we propose ButterFly, a novel system which collaboratively utilizes mobile GPUs to process high-quality rendering details for on-the-go mobile users. In particular, ButterFly makes three technical contributions to the collaborative design: (1) a mobile device can migrate GPU workloads in its buffer queue to peers, (2) the collaborative rendering mechanism gives users high-quality details while saving significant power, and (3) unnecessary 3D texture rendering can be clipped for further optimization. All the techniques are compatible with the OpenGL ES standards. Furthermore, a 40-person survey indicates that ButterFly can provide an excellent user experience in both rendering details and frame rate over a Wi-Fi network. In addition, our comprehensive trace-driven experiments on an Android prototype show that ButterFly outperforms state-of-the-art systems, achieving more than 28.3 percent power saving.
34Towards Bandwidth Guarantee for Virtual Clusters under Demand Uncertainty in Multi-Tenant Clouds
In the cloud, multiple tenants share the resource of datacenters and their applications compete with each other for scarce network bandwidth. Current studies have shown that the lack of bandwidth guarantee causes unpredictable network performance, leading to poor application performance. To address this issue, several virtual network abstractions have been proposed which allow the tenants to reserve virtual clusters with specified bandwidth between the Virtual Machines (VMs) in the datacenters. However, all these existing proposals require the tenants to deterministically characterize the bandwidth demands in the abstractions, which can be difficult and result in inefficient bandwidth reservation due to the demand uncertainty. In this paper, we explore a virtual cluster abstraction with stochastic bandwidth characterization to address the bandwidth demand uncertainty. We propose Stochastic Virtual Cluster (SVC), which models the bandwidth demand between VMs in a probabilistic way. Based on SVC, we develop a stochastic framework for virtual cluster allocation, in which the admitted virtual cluster's bandwidth demands are satisfied with a high probability. Efficient VM allocation algorithms are proposed to implement the framework while reducing the possibility of link congestion through minimizing the maximum bandwidth occupancy of a virtual cluster on physical links. Using simulations.
35Optimization of Error-Bounded Lossy Compression for Hard-to-Compress HPC Data
Since today's scientific applications are producing vast amounts of data, compressing them before storage/transmission is critical. Results of existing compressors show two types of HPC data sets: highly compressible and hard to compress. In this work, we carefully design and optimize the error-bounded lossy compression for hard-to-compress scientific data. We propose an optimized algorithm that can adaptively partition the HPC data into best-fit consecutive segments each having mutually close data values, such that the compression condition can be optimized. Another significant contribution is the optimization of shifting offset such that the XOR-leading-zero length between two consecutive unpredictable data points can be maximized. We finally devise an adaptive method to select the best-fit compressor at runtime for maximizing the compression factor. We evaluate our solution using 13 benchmarks based on real-world scientific problems, and we compare it with 9 other state-of-the-art compressors. Experiments show that our compressor can always guarantee the compression errors within the user-specified error bounds. Most importantly, our optimization can improve the compression factor effectively, by up to 49 percent for hard-to-compress data sets with similar compression/decompression time cost.
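The XOR-leading-zero length mentioned above can be made concrete with a short sketch. It assumes IEEE 754 double-precision values and simply counts how many leading bits two consecutive values share; the helper names and the offset 2.5 are illustrative assumptions, not the paper's offset-selection method.

```python
import struct


def float_bits(x):
    """Return the IEEE 754 binary64 bit pattern of x as an unsigned 64-bit int."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]


def xor_leading_zeros(a, b):
    """Count the leading zero bits in the XOR of the two 64-bit patterns.

    A longer run of leading zeros means the two values share more of their
    sign, exponent and high mantissa bits, so fewer bits need to be stored.
    """
    diff = float_bits(a) ^ float_bits(b)
    return 64 - diff.bit_length()


# 1.9 and 2.1 sit on opposite sides of a power of two, so their exponents differ.
plain = xor_leading_zeros(1.9, 2.1)
# Shifting both values by the same (arbitrarily chosen) offset moves them into
# one exponent range, which lengthens the shared prefix.
shifted = xor_leading_zeros(1.9 + 2.5, 2.1 + 2.5)
print(plain, shifted)
```

The comparison between `plain` and `shifted` only illustrates why a shifting offset can matter; how the paper chooses the offset is not described in the abstract.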
36VOLAP: A Scalable Distributed Real-Time OLAP System for High-Velocity Data
This paper presents VelocityOLAP (VOLAP), a distributed real-time OLAP system for high-velocity data. VOLAP makes use of dimension hierarchies, is highly scalable, exploits both multi-core and multi-processor parallelism, and can guarantee serializable execution of insert and query operations. In contrast to other high performance OLAP systems such as SAP HANA or IBM Netezza that rely on vertical scaling or special purpose hardware, VOLAP supports cost-efficient horizontal scaling on commodity hardware or modest cloud instances. Experiments on 20 Amazon EC2 nodes with TPC-DS data show that VOLAP is capable of bulk ingesting data at over 600 thousand items per second, and processing streams of interspersed insertions and aggregate queries at a rate of approximately 50 thousand insertions and 20 thousand aggregate queries per second with a database of 1 billion items. VOLAP is designed to support applications that perform large aggregate queries, and provides similar high performance for aggregations ranging from a few items to nearly the entire database.
37Time- and Cost- Efficient Task Scheduling across Geo-Distributed Data Centers
Typically called big data processing, analyzing large volumes of data from geographically distributed regions with machine learning algorithms has emerged as an important analytical tool for governments and multinational corporations. The traditional wisdom calls for the collection of all the data across the world to a central data center location, to be processed using data-parallel applications. This is neither efficient nor practical as the volume of data grows exponentially. Rather than transferring data, we believe that computation tasks should be scheduled near the data, while data should be processed with a minimum amount of transfers across data centers. In this paper, we design and implement Flutter, a new task scheduling algorithm that reduces both the completion times and the network costs of big data processing jobs across geographically distributed data centers. To cater to the specific characteristics of data-parallel applications, in the case of optimizing the job completion times only, we first formulate our problem as a lexicographical min-max integer linear programming (ILP) problem, and then transform the ILP problem into a nonlinear programming problem with a separable convex objective function and a totally unimodular constraint matrix, which can be further solved using a standard linear programming solver efficiently in an online fashion. In the case of improving both time- and cost-efficiency, we formulate the general problem as an ILP problem and we find that solving an LP problem can achieve the same goal in practice.
38Distributed Privacy-Aware Fast Selection Algorithm for Large-Scale Data
Finding the k-th smallest/largest element of a large array, i.e., k-selection, is a fundamental supporting algorithm in data analysis. Because big data is born in geo-distributed environments, it especially requires communication-efficient distributed k-selection, besides typical computation and memory efficiency. Moreover, sensitive organizations make data privacy a rigorous precondition for their participation in such distributed statistical analysis for common profit. To this end, we propose a Distributed Privacy-Aware Median (DPAM) selection algorithm for median selection in distributed large-scale data while preserving local statistics privacy, and extend it to arbitrary k-selection. DPAM utilizes the mean to approximate the median, via contraction of the standard deviation. It is theoretically the fastest, with a worst-case computation complexity of O(N), and is also highly efficient in communication overhead (logarithmic in the data range). To preserve ε-differential privacy of local statistics, DPAM randomly adds dummy elements (the number follows a rounded Laplacian distribution) to local data. The noise does not degrade the estimation precision or convergence rate. Performance of DPAM is compared with centralized/distributed quick select and optimization, in terms of complexity and privacy-preserving ability. Extensive simulation and experiment results show the higher efficiency of DPAM.
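For readers unfamiliar with k-selection, the centralized quick select that the abstract uses as a comparison baseline can be sketched as follows. This is the standard textbook algorithm under the assumption of 1-indexed k and in-memory data; it is not DPAM and has no distribution or privacy component.

```python
import random


def quickselect(values, k, rng=None):
    """Return the k-th smallest element (1-indexed) of values.

    Repeatedly partitions around a random pivot and keeps only the side
    that contains the requested rank, giving expected linear time.
    """
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and len(values)")
    rng = rng or random.Random(0)
    items = list(values)
    while True:
        pivot = rng.choice(items)
        lower = [v for v in items if v < pivot]
        equal = [v for v in items if v == pivot]
        if k <= len(lower):
            items = lower                      # answer is among the smaller values
        elif k <= len(lower) + len(equal):
            return pivot                       # pivot itself has rank k
        else:
            k -= len(lower) + len(equal)       # skip everything <= pivot
            items = [v for v in items if v > pivot]


data = [7, 2, 9, 4, 4, 1, 8]
print(quickselect(data, 4))  # median of the seven values
```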

39Auditing Big Data Storage in Cloud Computing Using Divide and Conquer Tables
Cloud computing has arisen as the mainstream platform of the utility computing paradigm, offering reliable and robust infrastructure for storing data remotely and providing on-demand applications and services. Currently, establishments that produce huge volumes of sensitive data leverage data outsourcing to reduce the burden of local data storage and maintenance. The outsourced data in the cloud, however, are not always trustworthy because data owners lack physical control over the data. To address this issue, researchers have focused on relieving the security threats by designing remote data checking (RDC) techniques. However, the majority of these techniques are inapplicable to big data storage because they incur huge computation cost on the user and cloud sides. Existing schemes suffer from the data dynamicity problem on two sides. First, they are only applicable to static archive data and cannot audit dynamic outsourced data. Second, although some of the existing methods are able to support dynamic data updates, increasing the number of update operations imposes high computation and communication cost on the auditor due to maintenance of the data structure, i.e., the Merkle hash tree. This paper presents an efficient RDC method on the basis of algebraic properties of the outsourced files in cloud computing, which inflicts the least computation and communication cost. Of the computation and communication cost on the auditor and cloud.
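The Merkle hash tree that the abstract names as the data structure maintained by existing dynamic auditing schemes can be sketched in a few lines. This is a generic illustration, assuming SHA-256 and duplication of the last node on odd levels; it is not the method proposed in the paper, and the function name is hypothetical.

```python
import hashlib


def merkle_root(blocks):
    """Return the Merkle root (bytes) of a non-empty list of byte blocks.

    Leaves are SHA-256 hashes of the blocks; each parent is the hash of
    its two children concatenated. An odd node is paired with itself.
    """
    if not blocks:
        raise ValueError("at least one block is required")
    level = [hashlib.sha256(b).digest() for b in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


original = merkle_root([b"block-0", b"block-1", b"block-2"])
updated = merkle_root([b"block-0", b"block-1 (edited)", b"block-2"])
print(original.hex() != updated.hex())
```

Every block update changes the hashes on the path from that leaf to the root, which is the maintenance cost the abstract attributes to such schemes.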
40Light Weight Write Mechanism for Cloud Data
Outsourcing data to the cloud for computation and storage has been on the rise in recent years. In this paper we investigate the problem of supporting write operation on the outsourced data for clients using mobile devices. We consider the Ciphertext-Policy Attribute-based Encryption (CP-ABE) scheme as it is well suited to support access control in outsourced cloud environments. One shortcoming of CP-ABE is that users can modify the access policy specified by the data owner if write operations are incorporated in the scheme. We propose a protocol for collaborative processing of outsourced data that enables the authorized users to perform write operation without being able to alter the access policy specified by the data owner. Our scheme is accompanied with a light weight signature scheme and simple, inexpensive user revocation mechanism to make it suitable for processing on resource-constrained mobile devices. The implementation and detailed performance analysis of the scheme indicate the suitability of the proposed scheme for real mobile applications. Moreover, the security analysis demonstrates that the security properties of the system are not compromised.
41A Hierarchical RAID Architecture towards Fast Recovery and High Reliability
Disk failures are very common in modern storage systems due to the large number of inexpensive disks. As a result, it takes a long time to recover a failed disk due to its large capacity and limited I/O. To speed up the recovery process and maintain a high system reliability, we propose a hierarchical code architecture with erasure codes, OI-RAID, which consists of two layers of codes, outer layer code and inner layer code. Specifically, the outer layer code is deployed with disk grouping technique based on Balanced Incomplete Block Design (BIBD) or complete graph with skewed data layout to provide efficient parallel I/O of all disks for fast failure recovery, and the inner layer code is deployed within each group of disks to provide high reliability. As an example, we deploy RAID5 in both layers to achieve fault tolerance of at least three disk failures, which meets the requirement of data availability in practical systems, as well as much higher speed up ratio for disk failure recovery than existing approaches. Besides, OI-RAID also keeps the optimal data update complexity and incurs low storage overhead in practice.
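The RAID5 building block used in both layers relies on XOR parity, which a small sketch can make concrete. It assumes a single stripe with equally sized blocks and one lost block; it does not model the BIBD-based grouping or the two-layer layout of OI-RAID, and the block contents are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data blocks plus one parity block.
data = [b"disk", b"fail", b"test"]
parity = xor_blocks(data)

# Suppose the disk holding data[1] fails: XOR the survivors with the parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt)
```

Because recovery reads every surviving block of the stripe, spreading stripes across many disks lets the reads proceed in parallel, which is the motivation the abstract gives for the outer layer.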
42Minimize the Make-span of Batched Requests for FPGA Pooling in Cloud Computing
Using FPGA as accelerators is gaining popularity in Cloud computing. Usually, FPGA accelerators in a datacenter are managed as a single resource pool. By issuing a request to this pool, a tenant can transparently access FPGA resources. FPGA requests usually arrive in batches. The objective of scheduling is to minimize the make-span of a given batch of requests. As a result, either the responsiveness is improved, or the system throughput is maximized. The key technical challenge is the existence of multiple resource bottlenecks. An FPGA job can be bottlenecked by either computation (i.e., computation-intensive) or network (i.e., network-intensive), and sometimes by both. To our best knowledge, this is the first work that minimizes the make-span of batched requests for an FPGA accelerator pool in Cloud computing that considers multiple resource bottlenecks. In this paper, we design several scheduling algorithms to address the challenge. We implement our scheduling algorithms in IBM's Cloud system. We conduct extensive evaluations on both a small scale testbed and a large-scale simulation. Compared with the Shortest-Job-First scheduling, our algorithms can reduce the make-span by 36.25%, and improve the system throughput by 36.05%.
43Data Mining in Sports: A Systematic Review
Data mining has attracted attention in the information industry and in society as a whole because of the large amount of available data and the imminent need to transform that data into useful information and knowledge. Recent studies have reported successful results using this technique to estimate several parameters in a variety of domains. However, the effective use of data is still developing in some areas, as is the case in sports, which has shown moderate growth. In this context, the objective of this article is to present a systematic review of the literature on research involving sports data mining. Systematic searches were carried out in five databases, resulting in 21 articles that answered the question underlying this article.
44Joint Scheduling and Source Selection for Background Traffic in Erasure-Coded Storage
Erasure-coded storage systems have gained considerable adoption recently since they can provide the same level of reliability with significantly lower storage overhead compared to replicated systems. However, background traffic of such systems (e.g., repair, rebalance, backup and recovery traffic) often has large volume and consumes significant network resources. Independently scheduling such tasks and selecting their sources can easily create interference among data flows, causing severe deadline violations. We show that the well-known heuristic scheduling algorithms fail to consider important constraints, thus resulting in unsatisfactory performance. In this paper, we claim that an optimal scheduling algorithm, which aims to maximize the number of background tasks completed before deadlines, must simultaneously consider task deadline, network topology, chunk placement, and time-varying resource availability. We first show that the corresponding optimization problem is NP-hard. Then we propose a novel algorithm, called Linear Programming for Selected Tasks (LPST), to maximize the number of successful tasks and improve overall utilization of the datacenter network. It jointly schedules tasks and selects their sources based on a notion of Remaining Time Flexibility, which measures the slackness of the starting time of a task. We evaluate the efficacy of our algorithm using extensive simulations and validate the results with experiments in a real cloud environment.
45SMGuard: A Flexible and fine-grained resource management framework for GPUs
GPUs have become an indispensable computing platform in data centers, and co-locating multiple applications on the same GPU is widely used to improve resource utilization. However, performance interference due to uncontrolled resource contention severely degrades the performance of co-located applications and fails to deliver a satisfactory user experience. In this paper, we present SMGuard, a software approach to flexibly manage the GPU resource usage of multiple applications under co-location. We also propose a capacity-based GPU resource model, CapSM, which provisions the GPU resource at a fine granularity among co-located applications. When co-locating latency-sensitive applications with batch applications, SMGuard can prevent batch applications from occupying resources without constraint using a quota-based mechanism, and guarantee the resource usage of latency-sensitive applications with a reservation-based mechanism. In addition, SMGuard supports dynamic resource adjustment by evicting the running thread blocks of batch applications to release the occupied resource and remapping the uncompleted thread blocks to the remaining resource. SMGuard is a pure software solution that does not rely on a special GPU architecture or programming model. Our evaluation shows that SMGuard improves the average performance of latency-sensitive applications by 9.8x.
46M-Oscillating: Performance Maximization on Temperature-Constrained Multi-Core Processors
The ever-increasing computational demand drives modern electronic devices to integrate more processing elements for pursuing higher computing performance. However, the resulting soaring power density and potential thermal crisis constrain the system performance under a maximally allowed temperature. This paper analytically studies the throughput maximization problem of multi-core platforms under the peak temperature constraints. To take advantage of thermal heterogeneity of different cores for performance improvement, we propose to run each core with multiple speed levels and develop a schedule based on two novel concepts, i.e. the step-up schedule and the m-Oscillating schedule, for multi-core platforms. The proposed methodology can ensure the peak temperature guarantee with a significant improvement in computing throughput up to 89%, with an average improvement of 11%. Meanwhile, the computational time reduces orders of magnitude compared to the traditional exhaustive search-based approach.
47Scalable Data Race Detection for Lock-intensive Programs with Pending Period Representation
Most dynamic data race detectors rely on the underlying happens-before order to yield precise reports, and they are notoriously prone to prohibitively high base overhead. Although a wealth of research has succeeded in significantly reducing the analysis overhead on memory accesses, handling the large and fundamentally unscalable synchronization overhead remains an open problem, which can be particularly serious for large, lock-intensive programs with a long running time and a large number of threads. In this paper, we revisit the synchronization problem of off-the-shelf race detection with a comprehensive study. The key insight of this work is that, seen from a new perspective of "global clock" representation, the full collection of partial orders on synchronization operations kept in prior work does not necessarily need to be tracked and analyzed. We therefore develop this insight into a novel pending-period based approach, aiming at reducing the overhead of monitoring and analyzing unnecessary synchronization operations. Further, we also significantly improve the efficiency of existing sampling techniques, in which synchronization operations are often conservatively identified.
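For readers unfamiliar with the happens-before baseline that the abstract refers to, the following is a minimal sketch of vector-clock race detection. It is not the pending-period approach of the paper. The class `Detector`, its methods and the event format are hypothetical names chosen for illustration, and the sketch keeps every access in memory, which real detectors avoid.

```python
def happens_before(a, b):
    """True if vector clock a is strictly earlier than vector clock b."""
    keys = set(a) | set(b)
    no_later = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    strictly = any(a.get(k, 0) < b.get(k, 0) for k in keys)
    return no_later and strictly


class Detector:
    """Illustrative happens-before race detector (hypothetical API)."""

    def __init__(self):
        self.thread_clock = {}  # thread id -> vector clock (dict)
        self.lock_clock = {}    # lock name -> clock stored at last release
        self.accesses = {}      # variable -> list of (tid, clock, is_write)
        self.races = []         # reported (variable, earlier tid, later tid)

    def _clock(self, tid):
        return self.thread_clock.setdefault(tid, {tid: 1})

    def acquire(self, tid, lock):
        # Join the thread clock with the clock left by the last release.
        clock = self._clock(tid)
        for k, v in self.lock_clock.get(lock, {}).items():
            clock[k] = max(clock.get(k, 0), v)

    def release(self, tid, lock):
        # Publish the thread clock on the lock, then advance the thread.
        clock = self._clock(tid)
        self.lock_clock[lock] = dict(clock)
        clock[tid] += 1

    def access(self, tid, var, is_write):
        now = dict(self._clock(tid))
        for other_tid, clock, other_write in self.accesses.get(var, []):
            conflicting = other_tid != tid and (is_write or other_write)
            if conflicting and not happens_before(clock, now):
                self.races.append((var, other_tid, tid))
        self.accesses.setdefault(var, []).append((tid, now, is_write))
```

Every `acquire` and `release` here performs a clock join or copy whose cost grows with the number of threads, which is the kind of synchronization overhead the abstract describes as hard to scale in lock-intensive programs.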
48Core Maintenance in Dynamic Graphs: A Parallel Approach based on Matching
The core number of a vertex is a basic index depicting the cohesiveness of a graph, and has been widely used in large-scale graph analytics. In this paper, we study the update of core numbers of vertices in dynamic graphs with edge insertions/deletions, which is known as the core maintenance problem. Unlike previous approaches, which focus only on single-edge insertion/deletion and handle the edges sequentially when multiple edges are inserted/deleted, we investigate the parallelism in the core maintenance procedure. Specifically, we show that if the inserted/deleted edges constitute a matching, the core number update with respect to each inserted/deleted edge can be handled in parallel. Based on this key observation, we propose parallel algorithms for core maintenance in both cases of edge insertions and deletions. Extensive experiments are conducted to evaluate the efficiency, stability, parallelism and scalability of our algorithms on different types of real-world graphs, synthetic graphs and temporal networks. Compared with previous approaches, our algorithms can improve the core maintenance efficiency significantly.
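To make the two notions in this abstract concrete, here is a minimal sketch of how core numbers can be computed from scratch by peeling minimum-degree vertices, and how a batch of edges can be tested for being a matching. It does not reproduce the paper's parallel maintenance algorithms. The function names `core_numbers` and `is_matching` are hypothetical, and the simple peeling loop favors clarity over speed.

```python
from collections import defaultdict


def core_numbers(edges):
    """Return {vertex: core number} for an undirected simple graph."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    degree = {v: len(neigh) for v, neigh in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        # Peel a vertex of minimum remaining degree.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


def is_matching(edges):
    """True if no two edges in the batch share an endpoint."""
    seen = set()
    for u, v in edges:
        if u == v or u in seen or v in seen:
            return False
        seen.add(u)
        seen.add(v)
    return True
```

For a triangle on vertices 1, 2, 3 with an extra edge (3, 4), the triangle vertices get core number 2 and vertex 4 gets core number 1. A batch such as [(1, 2), (3, 4)] passes `is_matching`, which is the condition under which the abstract says the per-edge updates can proceed in parallel.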
49MPCA SGD - A method for distributed training of deep learning models on Spark
Many distributed deep learning systems have been published over the past few years, often accompanied by impressive performance claims. In practice these figures are often achieved in high performance computing (HPC) environments with fast InfiniBand network connections. For average deep learning practitioners this is usually an unrealistic scenario, since they cannot afford access to these facilities. Simple re-implementations of algorithms such as EASGD [1] for standard Ethernet environments often fail to replicate the scalability and performance of the original works [2]. In this paper, we explore this particular problem domain and present MPCA SGD, a method for distributed training of deep neural networks that is specifically designed to run in low-budget environments. MPCA SGD tries to make the best possible use of available resources, and can operate well if network bandwidth is constrained. Furthermore, MPCA SGD runs on top of the popular Apache Spark [3] framework. Thus, it can easily be deployed in existing data centers and office environments where Spark is already used. When training large deep learning models in a gigabit Ethernet cluster, MPCA SGD achieves significantly faster convergence rates than many popular alternatives.
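As background for the EASGD baseline cited in this abstract, the sketch below shows one synchronous elastic-averaging round on scalar parameters. It illustrates EASGD only, not MPCA SGD, whose details the abstract does not give. The function names, the scalar model and the choice to communicate on every step are simplifying assumptions made here.

```python
def easgd_round(workers, center, grad, eta, alpha):
    """One synchronous EASGD round; returns (new_workers, new_center).

    Each worker takes a gradient step and is pulled toward the center
    variable; the center moves toward the workers by the same elastic force.
    """
    new_workers = [x - eta * grad(x) - alpha * (x - center) for x in workers]
    new_center = center + alpha * sum(x - center for x in workers)
    return new_workers, new_center


def train(workers, center, grad, eta=0.1, alpha=0.05, rounds=500):
    """Run several rounds; eta and alpha are illustrative defaults."""
    for _ in range(rounds):
        workers, center = easgd_round(workers, center, grad, eta, alpha)
    return workers, center
```

With the gradient of 0.5 * (x - 3)^2, that is `lambda x: x - 3.0`, the workers and the center all drift toward 3. In a cluster, each round implies exchanging parameters between workers and the center, which is where constrained network bandwidth becomes the bottleneck the abstract describes.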

Network Security projects

Network security projects are relevant to competitive fields such as ethical hacking, telecommunication and networking. Leading companies spend huge amounts on building and deploying specialized products designed to resist external hackers and competitors. For this reason, the security domain is as prominent as ever. Every network security professional carries the prestigious responsibility of protecting confidential information against hacking, spoofing and hijacking of server data. Because of these challenges, security remains indispensable to every domain, and so do network security professionals.

Possible Network Security Attacks

Eavesdropping, Denial of Service attack, Distributed Denial of Service attack, Password attack, Compromised-key attack, Man-in-the-middle attack, IP Spoofing, Application-layer attacks, Exploit attacks.
