MS-ASL: A Large-Scale Data Set and Benchmark for Understanding American Sign Language.
Citations
237 citations
Cites background from "MS-ASL: A Large-Scale Data Set and ..."
..., YouTube [62]) where signer provenance and skill is unknown....
[...]
...Purdue RVL-SLLL ASL [65] 104 14 no 2,576 yes no
RWTH Boston 104 [124] 104 3 no 201 yes no
Video-Based CSL [54] 178 50 no 25,000 yes no
Signum [118] 465 (24 train, 1 test) - 25 yes 15,075 yes no
MS-ASL [62] 1,000 (165 train, 37 dev, 20 test) - 222 yes 25,513 no yes
RWTH Phoenix [43] 1,081 9 no 6,841 yes yes
RWTH Phoenix SI5 [74] 1,081 (8 train, 1 test) - 9 yes 4,667 yes yes
Devisign [22] 2,000 8 no 24,000 no no...
[...]
..., [62]) – these posters may be fluent signers, interpreters, or sign language students; such videos are typically of “real-life” signs (i....
[...]
183 citations
Cites background from "MS-ASL: A Large-Scale Data Set and ..."
...However, most of the research to date has mainly focused on Isolated Sign Language Recognition [35, 75, 72, 10, 63, 67], working on application specific datasets [11, 71, 23], thus limiting the applicability of such technologies....
[...]
94 citations
Cites background or methods or result from "MS-ASL: A Large-Scale Data Set and ..."
...I3D results are reported from the original papers for MSASL [34] and WLASL [40]....
[...]
...I3D† denotes our implementation and training, adopting the hyper-parameters from [34]....
[...]
...MSASL [34] ASL 7 1000 25K (25) 222 lexicons, web
WLASL [40] ASL 7 2000 21K (11) 119 lexicons, web...
[...]
...Following [34, 40], we report both top-1 and top-5 classification accuracy, mainly due to ambiguities in signs which can be resolved in context....
[...]
...Specifically, we follow the I3D architecture [13] due to its success on action recognition benchmarks, as well as its recently observed success on sign recognition datasets [34, 40]....
[...]
References
72,897 citations
"MS-ASL: A Large-Scale Data Set and ..." refers methods in this paper
...We used VGG16 [55] network followed by an average pooling and LSTM layer of size 256 with batch normalization....
[...]
...This method use GoogleNets [59] as 2D-CNN with 2 bi-directional LSTM layers and 3 state HMM....
[...]
...The experimental result suggests that this data set is very difficult for 2D-CNN or at least LSTM could not pass the recurrent information well....
[...]
...Motivated by [17], we picked LSTM [28] as our recurrent layer which records the temporal ordering and long range dependencies by encoding the states....
[...]
55,235 citations
"MS-ASL: A Large-Scale Data Set and ..." refers methods in this paper
...We used VGG16 [55] network followed by an average pooling and LSTM layer of size 256 with batch normalization....
[...]
49,639 citations
"MS-ASL: A Large-Scale Data Set and ..." refers methods in this paper
...We started with pre-trained network trained on Imagenet [16] and Kinetics [10]....
[...]