
Showing papers on "Encoding (memory) published in 2017"


Journal ArticleDOI
TL;DR: A fundamentally different approach is needed, in which the cache contents are used as side information for coded communication over the shared link; such a coded caching scheme is proposed and proved to be close to optimal.
Abstract: We consider a network consisting of a file server connected through a shared link to a number of users, each equipped with a cache. Knowing the popularity distribution of the files, the goal is to optimally populate the caches, such as to minimize the expected load of the shared link. For a single cache, it is well known that storing the most popular files is optimal in this setting. However, we show here that this is no longer the case for multiple caches. Indeed, caching only the most popular files can be highly suboptimal. Instead, a fundamentally different approach is needed, in which the cache contents are used as side information for coded communication over the shared link. We propose such a coded caching scheme and prove that it is close to optimal.
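The gain from using cache contents as side information can be illustrated with a toy two-user, two-file instance of coded caching (a sketch of the idea only, not the paper's general scheme; the file contents and cache placement below are invented for illustration):

```python
# Toy coded-caching instance: N=2 files, K=2 users, each user caches half
# of every file. File halves are modeled as byte strings; the coded
# transmission is a single XOR that serves both users at once.

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Two files, each split into two halves.
A1, A2 = b"AAAA", b"aaaa"
B1, B2 = b"BBBB", b"bbbb"

# Placement phase: user 1 caches (A1, B1); user 2 caches (A2, B2).
cache1 = {"A1": A1, "B1": B1}
cache2 = {"A2": A2, "B2": B2}

# Delivery phase: user 1 requests file A, user 2 requests file B.
# Uncoded delivery would send A2 and B1 separately (two half-file
# transmissions); the coded scheme sends one XOR instead.
coded = xor(A2, B1)

# Each user cancels out the half it already holds in its cache.
user1_A2 = xor(coded, cache1["B1"])   # user 1 recovers A2
user2_B1 = xor(coded, cache2["A2"])   # user 2 recovers B1
```

In this instance the coded transmission halves the load on the shared link relative to uncoded delivery, which is the multiplicative gain the abstract refers to.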

224 citations


Proceedings ArticleDOI
01 Aug 2017
TL;DR: This work proposes a deep neural network for the purpose of recognizing violent videos that uses adjacent frame differences as the input to the model thereby forcing it to encode the changes occurring in the video.
Abstract: Developing a technique for the automatic analysis of surveillance videos in order to identify the presence of violence is of broad interest. In this work, we propose a deep neural network for the purpose of recognizing violent videos. A convolutional neural network is used to extract frame level features from a video. The frame level features are then aggregated using a variant of the long short-term memory that uses convolutional gates. The convolutional neural network along with the convolutional long short-term memory is capable of capturing localized spatio-temporal features which enables the analysis of local motion taking place in the video. We also propose to use adjacent frame differences as the input to the model thereby forcing it to encode the changes occurring in the video. The performance of the proposed feature extraction pipeline is evaluated on three standard benchmark datasets in terms of recognition accuracy. Comparison of the results obtained with the state-of-the-art techniques revealed the promising capability of the proposed method in recognizing violent videos.

162 citations


Proceedings ArticleDOI
05 Mar 2017
TL;DR: This work explores how logarithmic encoding of non-uniformly distributed weights and activations is preferred over linear encoding at resolutions of 4 bits and less and enables networks to achieve higher classification accuracies than fixed-point at low resolutions and eliminate bulky digital multipliers.
Abstract: We present the concept of logarithmic computation for neural networks. We explore how logarithmic encoding of non-uniformly distributed weights and activations is preferred over linear encoding at resolutions of 4 bits and less. Logarithmic encoding enables networks to 1) achieve higher classification accuracies than fixed-point at low resolutions and 2) eliminate bulky digital multipliers. We demonstrate our ideas in the hardware realization, LogNet, an inference engine using only bitshift-add convolutions and weights distributed across the computing fabric. The opportunities from hardware work in synergy with those from the algorithm domain.
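A minimal numpy sketch of the logarithmic-encoding idea (the exponent range and rounding below are assumptions, not the LogNet hardware design): each weight is stored as a sign plus an integer power-of-two exponent, so a multiply reduces to a bit-shift.

```python
import numpy as np

def log_quantize(w, exp_min=-7, exp_max=0):
    """Return (sign, exponent) with w approximated by sign * 2**exponent."""
    sign = np.sign(w)
    # Round log2 of the magnitude to the nearest integer exponent;
    # the tiny epsilon keeps log2 defined for exact zeros.
    exp = np.clip(np.round(np.log2(np.abs(w) + 1e-12)), exp_min, exp_max)
    return sign, exp.astype(int)

def log_dequantize(sign, exp):
    return sign * np.exp2(exp)

w = np.array([0.30, -0.12, 0.05, -0.9])
sign, exp = log_quantize(w)
w_hat = log_dequantize(sign, exp)   # each value snapped to +/- 2**k
```

Because every quantized weight is a signed power of two, a hardware multiply by `w_hat[i]` becomes a shift of the activation by `exp[i]` bits, which is the bitshift-add convolution the abstract describes.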

148 citations


Journal ArticleDOI
TL;DR: It is proposed that robust sustained activity that can support WM coding arises as a property of association cortices downstream from the early stages of sensory processing.

129 citations


Journal ArticleDOI
TL;DR: The authors develop a digital watermarking algorithm based on a fractal encoding method and the discrete cosine transform (DCT) that has higher performance characteristics, such as robustness and peak signal-to-noise ratio, than classical methods.
Abstract: With the rapid development of computer science, problems with digital product piracy and copyright disputes have become more serious; it is therefore an urgent task to find solutions for these problems. In this study, the authors develop a digital watermarking algorithm based on a fractal encoding method and the discrete cosine transform (DCT). The proposed method combines the fractal encoding method and the DCT method as a double encryption that improves on the traditional DCT method. The image is encoded by fractal encoding as the first encryption, and the encoded parameters are then used in the DCT method as the second encryption. First, the fractal encoding method is adopted to encode a private image with private scales. The encoding parameters serve as the digital watermark. Then, the watermark is reversibly added to the original image using DCT, which means the authors can extract the private image from the carrier image with the private encoding scales. Finally, attacking experiments are carried out on the carrier image using several attack methods. Experimental results show that the presented method has higher performance characteristics, such as robustness and peak signal-to-noise ratio, than classical methods.
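The DCT embedding step can be sketched as follows (a hedged illustration only: the fractal stage, which generates the watermark bits in the paper, is omitted, and the coefficient positions and embedding strength are invented for the example):

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix, so forward = C @ X @ C.T, inverse = C.T @ X @ C."""
    k = np.arange(n)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    return C

C = dct_matrix()
MID_BAND = [(2, 3), (3, 2), (3, 3), (4, 2)]   # assumed mid-frequency positions

def embed(block, bits, strength=25.0):
    coeffs = C @ block @ C.T                  # forward 2-D DCT
    for (r, c), b in zip(MID_BAND, bits):     # overwrite mid-band coefficients
        coeffs[r, c] = strength if b else -strength
    return C.T @ coeffs @ C                   # inverse 2-D DCT

def extract(block):
    coeffs = C @ block @ C.T
    return [int(coeffs[r, c] > 0) for (r, c) in MID_BAND]

rng = np.random.default_rng(1)
image_block = rng.uniform(0, 255, size=(8, 8))
bits = [1, 0, 1, 1]
watermarked = embed(image_block, bits)
recovered = extract(watermarked)
```

Mid-frequency coefficients are the usual compromise: low frequencies would visibly distort the image, while high frequencies are easily destroyed by compression.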

114 citations


Journal ArticleDOI
TL;DR: High-intensity exercise prior to memory encoding (vs. exercise during memory encoding or consolidation) was effective in enhancing long-term memory (for both 20-min and 24-hr follow-up assessments), and the timing of high-intensity exercise may play an important role in facilitating long-term memory.
Abstract: The broader purpose of this study was to examine the temporal effects of high-intensity exercise on learning, short-term and long-term retrospective memory, and prospective memory. A sample of 88 young adult participants was randomized, 22 per group, into one of four groups: exercise before learning, control, exercise during learning, and exercise after learning. The retrospective assessments (learning, short-term and long-term memory) used the Rey Auditory Verbal Learning Test. Long-term memory included both a 20-min and a 24-hr follow-up assessment. Prospective memory was assessed using a time-based procedure, with participants contacting the researchers (via phone) at a follow-up time. The exercise stimulus was a 15-min bout of progressive maximal-exertion treadmill exercise. High-intensity exercise prior to memory encoding (vs. exercise during memory encoding or consolidation) was effective in enhancing long-term memory (at both the 20-min and 24-hr follow-up assessments). We did not observe a differential temporal effect of high-intensity exercise on short-term memory (immediately post-encoding), learning, or prospective memory. The timing of high-intensity exercise may play an important role in facilitating long-term memory.

114 citations


Posted Content
TL;DR: This paper proposes a neural language model with a key-value attention mechanism that outputs separate representations for the key and value of a differentiable memory, as well as for encoding the next-word distribution, and that outperforms existing memory-augmented neural language models on two corpora.
Abstract: Neural language models predict the next token using a latent representation of the immediate token history. Recently, various methods for augmenting neural language models with an attention mechanism over a differentiable memory have been proposed. For predicting the next token, these models query information from a memory of the recent history which can facilitate learning mid- and long-range dependencies. However, conventional attention mechanisms used in memory-augmented neural language models produce a single output vector per time step. This vector is used both for predicting the next token as well as for the key and value of a differentiable memory of a token history. In this paper, we propose a neural language model with a key-value attention mechanism that outputs separate representations for the key and value of a differentiable memory, as well as for encoding the next-word distribution. This model outperforms existing memory-augmented neural language models on two corpora. Yet, we found that our method mainly utilizes a memory of the five most recent output representations. This led to the unexpected main finding that a much simpler model based only on the concatenation of recent output representations from previous time steps is on par with more sophisticated memory-augmented neural language models.
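The separation the abstract describes can be sketched in numpy, with random vectors standing in for learned LSTM outputs (the shapes and the tanh combination are illustrative assumptions, not the paper's exact parameterization):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4                        # time steps, width of each part

# Instead of one shared output vector per step, the model emits three
# separate parts: a key (for querying memory), a value (what attention
# retrieves), and a prediction vector (for the next-word distribution).
keys   = rng.normal(size=(T, d))
values = rng.normal(size=(T, d))
preds  = rng.normal(size=(T, d))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

t = T - 1                          # current time step
scores = keys[:t] @ keys[t]        # query the memory of past keys
attn = softmax(scores)
context = attn @ values[:t]        # retrieve past *values*, not past keys

# The next-word representation combines the retrieved context with the
# separate prediction part, so keys and values are free to specialize.
h = np.tanh(context + preds[t])
```

With a single shared output vector, `keys`, `values`, and `preds` would all be the same matrix; splitting them is the paper's core modification.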

110 citations


Journal ArticleDOI
TL;DR: The role of prediction errors (PEs) in human one-shot declarative learning is manipulated via previous experiences and sensory inputs, leading to superior memory across 5 different experiments.

105 citations


Journal ArticleDOI
TL;DR: Investigation of the influence of reward motivation on retroactive memory enhancement selectively for conceptually related information found behavioral evidence that reward retroactively enhances memory at a 24-h memory test, but not at an immediate memory test, suggesting a role for post-encoding mechanisms of consolidation.
Abstract: Reward motivation has been shown to modulate episodic memory processes in order to support future adaptive behavior. However, for a memory system to be truly adaptive, it should enhance memory for rewarded events as well as for neutral events that may seem inconsequential at the time of encoding but can gain importance later. Here, we investigated the influence of reward motivation on retroactive memory enhancement selectively for conceptually related information. We found behavioral evidence that reward retroactively enhances memory at a 24-h memory test, but not at an immediate memory test, suggesting a role for post-encoding mechanisms of consolidation.

98 citations


Journal ArticleDOI
TL;DR: This paper showed that the detection of event boundaries triggered a rapid memory reinstatement of the just-encoded sequence episode and was specific to context shifts that were preceded by an event sequence with episodic content.

84 citations


Journal ArticleDOI
TL;DR: The results suggest that memory deficits in adult ADHD reflect a learning deficit induced at the stage of encoding, which is strongly statistically related to the memory acquisition deficit.
Abstract: Objective: Memory problems are a frequently reported symptom in adult ADHD, and it is well-documented that adults with ADHD perform poorly on long-term memory tests. However, the cause of this effect is still controversial. The present meta-analysis examined underlying mechanisms that may lead to long-term memory impairments in adult ADHD. Method: We performed separate meta-analyses of measures of memory acquisition and long-term memory using both verbal and visual memory tests. In addition, the influence of potential moderator variables was examined. Results: Adults with ADHD performed significantly worse than controls on verbal but not on visual long-term memory and memory acquisition subtests. The long-term memory deficit was strongly statistically related to the memory acquisition deficit. In contrast, no retrieval problems were observable. Conclusion: Our results suggest that memory deficits in adult ADHD reflect a learning deficit induced at the stage of encoding. Implications for clinical and resea...

Journal ArticleDOI
TL;DR: These findings should stimulate a revisitation of the neural streams dedicated to perception and memory, with the MTL determining stimulus statistics and distinctiveness to support later memory encoding, and the PFC comparing stimuli to specific individual memories.

Journal ArticleDOI
TL;DR: FCA based on BAM is extended to three-way formal concept analysis (3WFCA) to achieve a more precise recall; an extra operator, namely a negative operator, is added to achieve this objective.
Abstract: The human brain represents information and stores it as memory. Memories are stored in different parts of the brain and are linked together by associations. When a cue is provided, the memory is recalled through association. Real-world information is encoded in the form of object-attribute relations. It is possible to perform both positive recall (an object having an attribute, or an attribute shared by an object) and negative recall (an object not having an attribute, or an attribute not shared by an object) from memory. It is evident from the literature that formal concept analysis (FCA) based on bidirectional associative memory (BAM) performs only positive recall from memory. In this paper, FCA based on BAM is extended to three-way formal concept analysis (3WFCA) to achieve a more precise recall. In this extended model, both positive and negative recall are performed. To achieve this, an extra operator, namely a negative operator, is added. The proposed model is validated with an experiment on a real-world scenario. We also present the connection of the proposal with long-term potentiation (LTP) and the hippocampus of the human brain.
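The classical positive-recall BAM that the paper extends can be sketched with bipolar outer-product storage (a textbook sketch under assumed toy patterns, not the proposed 3WFCA model with its negative operator):

```python
import numpy as np

def store(pairs):
    """Store object/attribute pairs as a sum of outer products."""
    n, m = len(pairs[0][0]), len(pairs[0][1])
    W = np.zeros((n, m))
    for x, y in pairs:
        W += np.outer(x, y)
    return W

def recall(W, x):
    """Recall the attribute pattern associated with object cue x."""
    return np.sign(W.T @ np.asarray(x))

# Two object -> attribute associations in bipolar {-1, +1} coding.
pairs = [
    (np.array([+1, -1, +1, -1]), np.array([+1, +1, -1])),
    (np.array([-1, +1, +1, +1]), np.array([-1, +1, +1])),
]
W = store(pairs)
y = recall(W, pairs[0][0])   # cue with the first object pattern
```

Recall works here because each cue correlates far more strongly with its own stored pattern than with the other one; the paper's contribution is extending this associative recall to explicit negative (object-does-not-have-attribute) queries.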

Journal ArticleDOI
TL;DR: This study provides the first evidence that manipulating event segmentation affects memory over long delays and that individual differences in event segmentation are related to differences in memory over long delays.
Abstract: When people observe everyday activity, they spontaneously parse it into discrete meaningful events. Individuals who segment activity in a more normative fashion show better subsequent memory for the events. If segmenting events effectively leads to better memory, does asking people to attend to segmentation improve subsequent memory? To answer this question, participants viewed movies of naturalistic activity with instructions to remember the activity for a later test, and in some conditions additionally pressed a button to segment the movies into meaningful events or performed a control condition that required button-pressing but not attending to segmentation. In 5 experiments, memory for the movies was assessed at intervals ranging from immediately following viewing to 1 month later. Performing the event segmentation task led to superior memory at delays ranging from 10 min to 1 month. Further, individual differences in segmentation ability predicted individual differences in memory performance for up to a month following encoding. This study provides the first evidence that manipulating event segmentation affects memory over long delays and that individual differences in event segmentation are related to differences in memory over long delays. These effects suggest that attending to how an activity breaks down into meaningful events contributes to memory formation. Instructing people to more effectively segment events may serve as a potential intervention to alleviate everyday memory complaints in aging and clinical populations.

Journal ArticleDOI
TL;DR: It is found that sleep has a protective effect on explicitly learned associations and the need for sleep‐mediated consolidation depends on the strategy used for learning and might be related to the level of integration of newly acquired memory achieved during encoding.

Journal ArticleDOI
TL;DR: An overview of the key findings related to D1/D5 receptor-dependent persistence of synaptic plasticity and memory in HPC is provided, especially focusing on the emerging evidence for a role of the locus coeruleus (LC) in DA-dependent memory consolidation.
Abstract: Most everyday memories including many episodic-like memories that we may form automatically in the hippocampus (HPC) are forgotten, while some of them are retained for a long time by a memory stabilization process, called initial memory consolidation. Specifically, the retention of everyday memory is enhanced, in humans and animals, when something novel happens shortly before or after the time of encoding. Converging evidence has indicated that dopamine (DA) signaling via D1/D5 receptors in HPC is required for persistence of synaptic plasticity and memory, thereby playing an important role in the novelty-associated memory enhancement. In this review paper, we aim to provide an overview of the key findings related to D1/D5 receptor-dependent persistence of synaptic plasticity and memory in HPC, especially focusing on the emerging evidence for a role of the locus coeruleus (LC) in DA-dependent memory consolidation. We then refer to candidate brain areas and circuits that might be responsible for detection and transmission of the environmental novelty signal and molecular and anatomical evidence for the LC-DA system. We also discuss molecular mechanisms that might mediate the environmental novelty-associated memory enhancement, including plasticity-related proteins that are involved in initial memory consolidation processes in HPC.

Journal ArticleDOI
TL;DR: Results provide the necessary demonstrations to further the feasibility of the MIMO model as a memory prosthesis to recover and/or enhance encoding of cognitive information in humans with memory disruptions resulting from brain injury, disease or aging.

Proceedings ArticleDOI
01 Apr 2017
TL;DR: This framework employs a reconfigurable clustering approach that encodes the parameters of deep neural networks in accordance with the application's accuracy requirement and the underlying platform constraints, increasing the effective throughput of FPGA-based realizations.
Abstract: We propose a novel end-to-end framework to customize execution of deep neural networks on FPGA platforms. Our framework employs a reconfigurable clustering approach that encodes the parameters of deep neural networks in accordance with the application's accuracy requirement and the underlying platform constraints. The throughput of FPGA-based realizations of neural networks is often bounded by the memory access bandwidth. The use of encoded parameters reduces both the required memory bandwidth and the computational complexity of neural networks, increasing the effective throughput. Our framework enables systematic customization of encoded deep neural networks for different FPGA platforms. Proof-of-concept evaluations on four different applications demonstrate up to 9-fold reduction in memory footprint and 15-fold improvement in the operational throughput while the drop in accuracy remains below 0.1%.
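The core encoding idea, replacing full-precision weights with a small codebook plus low-bit indices, can be sketched as 1-D k-means (the fixed initial centroids and toy weights are assumptions for the example; the paper's reconfigurable clustering is more elaborate):

```python
import numpy as np

def cluster_weights(w, init_centroids, iters=20):
    """1-D k-means: returns per-weight cluster indices and the codebook."""
    centroids = np.asarray(init_centroids, dtype=float).copy()
    for _ in range(iters):
        # Assign each weight to its nearest centroid.
        idx = np.argmin(np.abs(w[:, None] - centroids[None, :]), axis=1)
        # Move each non-empty centroid to the mean of its members.
        for j in range(len(centroids)):
            if np.any(idx == j):
                centroids[j] = w[idx == j].mean()
    return idx, centroids

w = np.array([0.11, 0.09, -0.52, -0.48, 0.10, 0.90, 0.88, -0.50])
idx, codebook = cluster_weights(w, init_centroids=[-0.5, 0.0, 0.5, 1.0])
w_hat = codebook[idx]   # decoded weights: a 2-bit index per weight + codebook
```

Here 8 weights are stored as 8 two-bit indices plus a 4-entry codebook instead of 8 full-precision values, which is where the memory-bandwidth reduction comes from.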


Proceedings ArticleDOI
01 Oct 2017
TL;DR: This paper proposes a solution for group re-identification that is grounded on transferring knowledge from single-person re-identification to group re-identification by exploiting sparse dictionary learning, and shows that the proposed solution outperforms state-of-the-art approaches.
Abstract: Person re-identification is best known as the problem of associating a single person observed from one or more disjoint cameras. The existing literature has mainly addressed this issue, neglecting the fact that people usually move in groups, as in crowded scenarios. We believe that the additional information carried by neighboring individuals provides a relevant visual context that can be exploited to obtain a more robust match of single persons within the group. Despite this, re-identifying groups of people compounds the common single-person re-identification problems by introducing changes in the relative position of persons within the group and severe self-occlusions. In this paper, we propose a solution for group re-identification that is grounded on transferring knowledge from single-person re-identification to group re-identification by exploiting sparse dictionary learning. First, a dictionary of sparse atoms is learned using patches extracted from single-person images. Then, the learned dictionary is exploited to obtain a sparsity-driven residual group representation, which is finally matched to perform the re-identification. Extensive experiments on the i-LIDS groups dataset and two newly collected datasets show that the proposed solution outperforms state-of-the-art approaches.

Journal ArticleDOI
TL;DR: The dependency-based Bi-LSTM model can learn effective relation information with less feature engineering in the task of DDI extraction and achieves new state-of-the-art performance.
Abstract: Drug-drug interaction (DDI) extraction needs assistance from automated methods to address the explosively increasing volume of biomedical texts. In recent years, deep neural network based models have been developed to address such needs and have made significant progress in relation identification. We propose a dependency-based deep neural network model for DDI extraction. By introducing dependency-based techniques to a bi-directional long short-term memory network (Bi-LSTM), we build three channels, namely a Linear channel, a DFS channel and a BFS channel. Each channel is constructed from three network layers, from bottom up: an embedding layer, an LSTM layer and a max pooling layer. In the embedding layer, we extract two types of features: one is a distance-based feature and the other is a dependency-based feature. In the LSTM layer, a Bi-LSTM is instantiated in each channel to better capture relation information. Max pooling is then used to select the most salient features from the encoded sequence. Finally, we concatenate the outputs of all channels and feed them to a softmax layer for relation identification. To the best of our knowledge, our model achieves new state-of-the-art performance, with an F-score of 72.0% on the DDIExtraction 2013 corpus. Moreover, our approach obtains a much higher recall than existing methods. The dependency-based Bi-LSTM model can learn effective relation information with less feature engineering in the task of DDI extraction. In addition, the experimental results show that our model excels at balancing precision and recall.

Proceedings ArticleDOI
01 May 2017
TL;DR: A novel method is proposed in which encoding modes, e.g. coding block structure, prediction types and motion vectors, are selected based on a noise-reduced version of the input sequence, while the content is coded based on the unaltered input sequence.
Abstract: This paper concerns optimization of encoding in HEVC. A novel method is proposed in which encoding modes, e.g. coding block structure, prediction types and motion vectors, are selected based on a noise-reduced version of the input sequence, while the content, e.g. transform coefficients, is coded based on the unaltered input sequence. Although the proposed scheme involves encoding two versions of the input sequence, the proposed realization ensures that the complexity is only negligibly larger than that of a single encoder. The proposal has been implemented and assessed. The experimental results show that it provides up to 1.5% bitrate reduction while preserving the same video quality.

Proceedings ArticleDOI
19 Jul 2017
TL;DR: A novel integrated deep architecture is developed to effectively encode the detailed semantics of informative images and long descriptive sentences, named as Textual-Visual Deep Binaries (TVDB), where region-based convolutional networks with long short-term memory units are introduced to fully explore image regional details while semantic cues of sentences are modeled by a text convolutional network.
Abstract: Cross-modal hashing is usually regarded as an effective technique for large-scale textual-visual cross retrieval, where data from different modalities are mapped into a shared Hamming space for matching. Most of the traditional textual-visual binary encoding methods only consider holistic image representations and fail to model descriptive sentences. This renders existing methods inappropriate to handle the rich semantics of informative cross-modal data for quality textual-visual search tasks. To address the problem of hashing cross-modal data with semantic-rich cues, in this paper, a novel integrated deep architecture is developed to effectively encode the detailed semantics of informative images and long descriptive sentences, named as Textual-Visual Deep Binaries (TVDB). In particular, region-based convolutional networks with long short-term memory units are introduced to fully explore image regional details while semantic cues of sentences are modeled by a text convolutional network. Additionally, we propose a stochastic batch-wise training routine, where high-quality binary codes and deep encoding functions are efficiently optimized in an alternating manner. Experiments are conducted on three multimedia datasets, i.e. Microsoft COCO, IAPR TC-12, and INRIA Web Queries, where the proposed TVDB model significantly outperforms state-of-the-art binary coding methods in the task of cross-modal retrieval.

Journal ArticleDOI
TL;DR: In this article, the effects of MTL stimulation on memory performance were studied in five patients undergoing invasive electrocorticographic monitoring during various phases of a memory task (encoding, distractor, recall).

Journal ArticleDOI
TL;DR: A mechanistic model of human network recall is described and demonstrated its sufficiency for capturing human recall behavior observed in experimental contexts and it is found that human recall is predicated on accurate recall of a small number of high degree network nodes and the application of heuristics for both structural and affective information.
Abstract: The social brain hypothesis argues that the need to deal with social challenges was key to our evolution of high intelligence. Research with non-human primates as well as experimental and fMRI studies in humans produce results consistent with this claim, leading to an estimate that human primary groups should consist of roughly 150 individuals. Gaps between this prediction and empirical observations can be partially accounted for using “compression heuristics”, or schemata that simplify the encoding and recall of social information. However, little is known about the specific algorithmic processes used by humans to store and recall social information. We describe a mechanistic model of human network recall and demonstrate its sufficiency for capturing human recall behavior observed in experimental contexts. We find that human recall is predicated on accurate recall of a small number of high degree network nodes and the application of heuristics for both structural and affective information. This provides new insight into human memory, social network evolution, and demonstrates a novel approach to uncovering human cognitive operations.

Journal ArticleDOI
TL;DR: HIV-infected patients were significantly less accurate on the working memory task and their neuronal dynamics indicated that encoding operations were preserved, while memory maintenance processes were abnormal, suggesting impairments likely reflect deficits in the maintenance of memory representations.
Abstract: Impairments in working memory are among the most prevalent features of HIV-associated neurocognitive disorders (HAND), yet their origins are unknown, with some studies arguing that encoding operations are disturbed and others supporting deficits in memory maintenance. The current investigation directly addresses this issue by using a dynamic mapping approach to identify when and where processing in working memory circuits degrades. HIV-infected older adults and a demographically-matched group of uninfected controls performed a verbal working memory task during magnetoencephalography (MEG). Significant oscillatory neural responses were imaged using a beamforming approach to illuminate the spatiotemporal dynamics of neuronal activity. HIV-infected patients were significantly less accurate on the working memory task and their neuronal dynamics indicated that encoding operations were preserved, while memory maintenance processes were abnormal. Specifically, no group differences were detected during the encoding period, yet dysfunction in occipital, fronto-temporal, hippocampal, and cerebellar cortices emerged during memory maintenance. In addition, task performance in the controls covaried with occipital alpha synchronization and activity in right prefrontal cortices. In conclusion, working memory impairments are common and significantly impact the daily functioning and independence of HIV-infected patients. These impairments likely reflect deficits in the maintenance of memory representations, not failures to adequately encode stimuli.

Book ChapterDOI
04 Jan 2017
TL;DR: This work proposes Spatio-temporal VLAD (ST-VLAD), an extended encoding method which incorporates spatio-temporal information within the encoding process by proposing a video division and extracting specific information over the feature group of each video split.
Abstract: Encoding is one of the key factors for building an effective video representation. In the recent works, super vector-based encoding approaches are highlighted as one of the most powerful representation generators. Vector of Locally Aggregated Descriptors (VLAD) is one of the most widely used super vector methods. However, one of the limitations of VLAD encoding is the lack of spatial information captured from the data. This is critical, especially when dealing with video information. In this work, we propose Spatio-temporal VLAD (ST-VLAD), an extended encoding method which incorporates spatio-temporal information within the encoding process. This is carried out by proposing a video division and extracting specific information over the feature group of each video split. Experimental validation is performed using both hand-crafted and deep features. Our pipeline for action recognition with the proposed encoding method obtains state-of-the-art performance over three challenging datasets: HMDB51 (67.6%), UCF50 (97.8%) and UCF101 (91.5%).
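Plain VLAD, the baseline that ST-VLAD extends, can be sketched in a few lines of numpy (toy 2-D descriptors and a 2-word codebook for illustration; the paper's spatio-temporal video splitting is not shown):

```python
import numpy as np

def vlad(descriptors, centers):
    """Hard-assign each descriptor to its nearest codebook centre,
    accumulate residuals per centre, then flatten and L2-normalise."""
    d2 = ((descriptors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(1)                        # nearest-centre assignment
    v = np.zeros_like(centers)
    for i, k in enumerate(assign):
        v[k] += descriptors[i] - centers[k]      # residual accumulation
    v = v.ravel()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])   # local descriptors
C = np.array([[1.0, 0.0], [0.0, 1.0]])               # learned codebook
code = vlad(X, C)   # length = n_centres * descriptor_dim = 4
```

Because VLAD pools residuals over the whole set of local descriptors, it discards where and when each descriptor occurred, which is exactly the limitation the abstract's spatio-temporal extension targets.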

Journal ArticleDOI
TL;DR: The findings indicate that congruent events can trigger an accelerated onset of neural encoding mechanisms supporting the integration of semantic information with the event input, which would result in a long-lasting and meaningful memory trace for the event.
Abstract: As the stream of experience unfolds, our memory system rapidly transforms current inputs into long-lasting meaningful memories. A putative neural mechanism that strongly influences how input elements are transformed into meaningful memory codes relies on the ability to integrate them with existing structures of knowledge or schemas. However, it is not yet clear whether schema-related integration neural mechanisms occur during online encoding. In the current investigation, we examined the encoding-dependent nature of this phenomenon in humans. We showed that actively integrating words with congruent semantic information provided by a category cue enhances memory for words and increases false recall. The memory effect of such active integration with congruent information was robust, even with an interference task occurring right after each encoding word list. In addition, via electroencephalography, we show in 2 separate studies that the onset of the neural signals of successful encoding appeared early (∼400 ms) during the encoding of congruent words. That the neural signals of successful encoding of congruent and incongruent information followed similarly ∼200 ms later suggests that this earlier neural response contributed to memory formation. We propose that the encoding of events that are congruent with readily available contextual semantics can trigger an accelerated onset of the neural mechanisms, supporting the integration of semantic information with the event input. This faster onset would result in a long-lasting and meaningful memory trace for the event but, at the same time, make it difficult to distinguish it from plausible but never encoded events (i.e., related false memories). SIGNIFICANCE STATEMENT Conceptual or schema congruence has a strong influence on long-term memory. However, the question of whether schema-related integration neural mechanisms occur during online encoding has yet to be clarified. 
We investigated the neural mechanisms reflecting how the active integration of words with congruent semantic categories enhances memory for words and increases false recall of semantically related words. We analyzed event-related potentials during encoding and showed that the onset of the neural signals of successful encoding appeared early (∼400 ms) during the encoding of congruent words. Our findings indicate that congruent events can trigger an accelerated onset of neural encoding mechanisms supporting the integration of semantic information with the event input.

Posted Content
TL;DR: In this article, a novel integrated deep architecture is developed to effectively encode the detailed semantics of informative images and long descriptive sentences, named as Textual-Visual Deep Binaries (TVDB).
Abstract: Cross-modal hashing is usually regarded as an effective technique for large-scale textual-visual cross retrieval, where data from different modalities are mapped into a shared Hamming space for matching. Most of the traditional textual-visual binary encoding methods only consider holistic image representations and fail to model descriptive sentences. This renders existing methods inappropriate to handle the rich semantics of informative cross-modal data for quality textual-visual search tasks. To address the problem of hashing cross-modal data with semantic-rich cues, in this paper, a novel integrated deep architecture is developed to effectively encode the detailed semantics of informative images and long descriptive sentences, named as Textual-Visual Deep Binaries (TVDB). In particular, region-based convolutional networks with long short-term memory units are introduced to fully explore image regional details while semantic cues of sentences are modeled by a text convolutional network. Additionally, we propose a stochastic batch-wise training routine, where high-quality binary codes and deep encoding functions are efficiently optimized in an alternating manner. Experiments are conducted on three multimedia datasets, i.e. Microsoft COCO, IAPR TC-12, and INRIA Web Queries, where the proposed TVDB model significantly outperforms state-of-the-art binary coding methods in the task of cross-modal retrieval.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: A method to compress the enormous amount of data originating from tactile sensors is presented that explicitly exploits the inherent sparseness over space and time, sending tactile “events” only when a contact is detected.
Abstract: We propose a method to compress the enormous amount of data originating from tactile sensors that explicitly exploits the inherent sparseness over space and time, sending tactile “events” only when a contact is detected. The resulting modular architecture is based on FPGA modules that acquire data samples from off-the-shelf tactile sensors based on capacitive transducers and generate and transmit an event-driven readout. This architecture has been specifically implemented for integration on robots with a large number of tactile sensors, to reduce communication bandwidth, power and processing requirements. An asynchronous serial address-event representation protocol further optimises the effective data transmission rate (efficiency of 94.1%) and latency (340 ns) with respect to more common transmission protocols (e.g., Ethernet, CAN). We propose two complementary algorithms for the translation of raw data into events, optimising data rate and bandwidth, or exploiting the asynchronous nature of the event-driven encoding and the temporal information within the sensory signal. Data reduction can reach 20% of the corresponding clock-based encoding, with limited information loss due to the compression.
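The send-only-on-change idea behind the event-driven readout can be sketched as a threshold-based translation of raw taxel frames into address-events (the threshold, frame layout and event tuple format are illustrative assumptions, not the paper's FPGA protocol):

```python
def to_events(frames, threshold=2):
    """Emit (time, taxel_address, value) events only when a taxel reading
    changes by more than `threshold` since its last transmitted value."""
    last = list(frames[0])
    # Transmit any initially non-zero taxels at t=0.
    events = [(0, addr, v) for addr, v in enumerate(frames[0]) if v != 0]
    for t, frame in enumerate(frames[1:], start=1):
        for addr, v in enumerate(frame):
            if abs(v - last[addr]) > threshold:
                events.append((t, addr, v))
                last[addr] = v      # remember the last transmitted value
    return events

# 3 taxels sampled over 4 frames; only taxel 1 is pressed and released.
frames = [
    [0, 0, 0],
    [0, 10, 0],
    [0, 11, 0],   # change within threshold: no event sent
    [0, 0, 0],
]
events = to_events(frames)
```

Here 12 clock-driven samples reduce to 2 events, illustrating how sparse contact activity translates into the large data-rate reductions the abstract reports.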