Mohammad Shoeybi
Researcher at Nvidia
Publications - 53
Citations - 3377
Mohammad Shoeybi is an academic researcher at Nvidia. The author has contributed to research in the topics Computer science and Jet (fluid). The author has an h-index of 15 and has co-authored 42 publications receiving 1589 citations. Previous affiliations of Mohammad Shoeybi include Sharif University of Technology and Stanford University.
Papers
Posted Content
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi,Md. Mostofa Ali Patwary,Raul Puri,Patrick LeGresley,Jared Casper,Bryan Catanzaro +5 more
TL;DR: A simple, efficient intra-layer model parallel approach that enables training transformer models with billions of parameters and shows that careful attention to the placement of layer normalization in BERT-like models is critical to achieving increased performance as the model size grows.
Proceedings Article
Deep Voice: Real-time Neural Text-to-Speech
Sercan O. Arik,Mike Chrzanowski,Adam Coates,Gregory Diamos,Andrew Gibiansky,Yongguo Kang,Xian Li,John J. Miller,Andrew Y. Ng,Jonathan Raiman,Shubho Sengupta,Mohammad Shoeybi +11 more
TL;DR: Deep Voice lays the groundwork for truly end-to-end neural speech synthesis, shows that inference with the system can be performed faster than real time, and describes optimized WaveNet inference kernels on both CPU and GPU that achieve up to 400x speedups over existing implementations.
Journal Article
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
Teven Le Scao,Angela Fan,Christopher Akiki,Elizabeth-Jane Pavlick,Suzana Ilic,Daniel Hesslow,Roman Castagné,Alexandra Luccioni,François Yvon,Matthias Gallé,J. S. Tow,Alexander M. Rush,Stella Biderman,Albert Webson,Pawan Sasanka Ammanamanchi,Thomas Wang,Benoît Sagot,Niklas Muennighoff,A. Villanova del Moral,Olatunji Ruwase,R. Bawden,Stas Bekman,Angelina McMillan-Major,Iz Beltagy,Huu Nguyen,Lucile Saulnier,Samson Tan,Pedro Javier Ortiz Suárez,Victor Sanh,Hugo Laurençon,Yacine Jernite,Julien Launay,Margaret Mitchell,Colin Raffel,Aaron Gokaslan,Adi Simhi,Aitor Soroa,Alham Fikri Aji,Amit Alfassy,Anna Rogers,Ariel Kreisberg Nitzav,Canwen Xu,Chenghao Mou,Chris Chinenye Emezue,Christopher Klamm,Colin D. Leong,Daniel van Strien,David Ifeoluwa Adelani,Dragomir R. Radev,Eduardo G. Ponferrada,Efrat Levkovizh,Ethan Kim,Eyal Natan,Francesco De Toni,Gérard Dupont,G. Kruszewski,Giada Pistilli,Hady Elsahar,Hamza Benyamina,H. Tran,Ian Yu,Idris Abdulmumin,Isaac Johnson,Itziar Gonzalez-Dios,Javier Galiana de la Rosa,Jenny Chim,Jesse Dodge,Jian Zhou,Jonathan Chang,Jorg Frohberg,Josephine Tobing,Joydeep Bhattacharjee,Khalid Almubarak,Kimbo Chen,Kyle Lo,Leandro von Werra,Leon Weber,Long Phan,Loubna Ben Allal,L Tanguy,Manan Dey,Manuel Romero Muñoz,Maraim Masoud,María Grandury,Mario Šaško,Max Huang,Maximin Coavoux,Mayank Singh,Mike Tian-Jian Jiang,Minh Chien Vu,M.A. Jauhar,Mustafa Ghaleb,Nishant Subramani,Nora Kassner,Nurulaqilla Khamis,Olivier Nguyen,Omar Espejel,Ona de Gibert,Paulo Villegas,Peter Henderson,Pierre Colombo,Priscilla Amuok,Quentin Lhoest,Rheza Harliman,Rishi Bommasani,R. López,Salomey Osei,Sampo Pyysalo,Sebastian Nagel,Shamik Bose,Shamsuddeen Hassan Muhammad,Shanya Sharma,Shayne Longpre,Somaieh Nikpoor,Stanislav Silberberg,Suhas Pai,S Zink,Tiago Timponi Torrent,Timo Schick,Tristan Thrush,Valentin Danchev,Vassilina Nikoulina,Veronika Laippala,Violette Lepercq,V. Prabhu,Zaid Alyafeai,Zeerak Talat,Arun Raja,Benjamin Heinzerling,Chenglei Si,Elizabeth Salesky,Sabrina J. Mielke,Wilson Y. Lee,Abheesht Sharma,Andrea Santilli,Antoine Chaffin,Arnaud Stiegler,Debajyoti Datta,Eliza Szczechla,Gunjan Chhablani,Han Wang,Harshit Pandey,Hendrik Strobelt,Jason A. Fries,Jos Rozen,Leo Gao,Lintang A. Sutawika,M Saiful Bari,Maged S. Al-shaibani,Matteo Manica,Nihal V. Nayak,Ryan Teehan,Samuel Albanie,Sheng Shen,Srulik Ben-David,Stephen H. Bach,Taewoon Kim,T. G. Owe Bers,Thibault Févry,Trishala Neeraj,Urmish Thakker,Vikas Raunak,Xiang Tang,Zheng-Xin Yong,Zhiqing Sun,Shaked Brody,Y Uri,Hadar Tojarieh,Adam Roberts,Hyung Won Chung,Jae-Oong Tae,Jason Phang,Ofir Press,Conglong Li,Deepak Narayanan,Hatim Bourfoune,Jared Casper,Jeffrey Thomas Rasley,Maksim Riabinin,Mayank Mishra,Minjia Zhang,Mohammad Shoeybi,Myriam Peyrounette,Nicolas Patry,Nouamane Tazi,Omar Sanseviero,Patrick von Platen,Pierre Cornette,Pierre François Lavallée,R. Lacroix,Samyam Rajbhandari,Sanchit Gandhi,Shaden Smith,S. Requena,Suraj Patil,Tim Dettmers,A. D. Baruwa,Anastasia Cheveleva,Anne-Laure Ligozat,Arjun Subramonian,Aurélie Névéol,Charles Lovering,Daniel H Garrette,Deepak R. Tunuguntla,Ehud Reiter,Ekaterina Taktasheva,E. Voloshina,Eli Bogdanov,Genta Indra Winata,Hailey Schoelkopf,Jan-Christoph Kalo,Jekaterina Novikova,Jessica Zosa Forde,Xiangru Tang,Jungo Kasai,Kenichi Kawamura,Liam Hazan,Marine Carpuat,Miruna-Adriana Clinciu,Najoung Kim,Newton Cheng,Oleg Serikov,Omer Antverg,Oskar van der Wal,Rui Zhang,Ruochen Zhang,Sebastian Gehrmann,Shachar Mirkin,S. Osher Pais,Tatiana Shavrina,Thomas Scialom,Tian Yun,Tomasz Limisiewicz,V. Rieser,Vitaly Protasov,Vladislav Mikhailov,Yada Pruksachatkun,Yonatan Belinkov,Zachary Bamberger,Zdeněk Kasner,Alice Rueda,A. Pestana,Amir Feizpour,Ammar Khan,Amy Faranak,A. Santos,Anthony Hevia,Antigona Unldreaj,Arash Aghagol,Arezoo Abdollahi,Aycha Tammour,Azadeh HajiHosseini,Bahareh Behroozi,Benjamin Olusola Ajibade,Bharat Kumar Saxena,Carlos Muñoz Ferrandis,Danish Contractor,David Lansky,Davis David,Douwe Kiela,Luong An Nguyen,Edward Chwee Kheng. Tan,Emily Baylor,Ezinwanne Ozoani,Fatim Tahirah Mirza,Frankline Ononiwu,Habib Rezanejad,H.A. Jones,Indrani Bhattacharya,Irene Solaiman,Irina Sedenko,Isar Nejadgholi,J. Lawrence Passmore,Joshua Seltzer,Julio Bonis Sanz,Lívia Macedo Dutra,Mairon Samagaio,Maraim Elbadri,M. Mieskes,Marissa Gerchick,Martha Akinlolu,Michael McKenna,Mike Qiu,M. K. K. Ghauri,Mykola Burynok,Nafis Abrar,Nazneen Fatema Rajani,Nour Elkott,Nourhan Fahmy,O. Samuel,Ran An,R. P. Kromann,Ryan Hao,Samira Alizadeh,Sarmad Shubber,Silas L Wang,Sourav Roy,Sylvain Viguier,Thanh-Cong Le,Tobi Oyebade,Trieu Hai Nam Le,Yoyo Yang,Zachary Nguyen,Abhinav Ramesh Kashyap,A. Palasciano,Alison Callahan,Anima Shukla,Antonio Miranda-Escalada,Ayush Kumar Singh,Benjamin Beilharz,Bo Wang,Caio Matheus Fonseca de Brito,Chenxi Zhou,Chirag Jain,Chuxin Xu,Clémentine Fourrier,Daniel León Periñán,Daniel Molano,Dian Yu,Enrique Peiró Sánchez Manjavacas,Fabio Barth,Florian Fuhrimann,Gabriel Altay,Giyaseddin Bayrak,Helena U Vrabec,Iman I.B. Bello,Isha Dash,Jihyun Kang,John M Giorgi,Jonas Golde,J. Posada,Karthi Sivaraman,Lokesh Bulchandani,Lu Liu,Luisa Shinzato,Madeleine Hahn de Bykhovetz,Maiko Takeuchi,Marc Pàmies,M Andrea Castillo,Marianna Nezhurina,Mario Sanger,Matthias Samwald,Michael Joseph Cullan,Michaela Django Weinberg,M. Wolf,Mina Mihaljcic,Minna Liu,Moritz Freidank,Myungsun Kang,Natasha Seelam,Nathan B Dahlberg,Nicholas Broad,N. Muellner,Pascale Fung,Patricia Haller,R. Chandrasekhar,R. Eisenberg,Robert Martin,Rodrigo L. Canalli,Rosaline Su,Ruisi Su,Samuel Cahyawijaya,Samuele Garda,Shlok S Deshmukh,Shubhanshu Mishra,Sid Kiblawi,Simon Ott,Sinee Sang-aroonsiri,Srishti Kumar,Stefan Schweter,Sushil Pratap Bharati,Tanmay Laud,Théo Gigant,Tomoya Kainuma,Wojciech Kusa,Yanis Labrak,Yashasvi Bajaj,Y. Venkatraman,Yifan Xu,Ying Xu,Yunchao Xu,Zhee Xao Tan,Zhong-li Xie,Zifan Ye,Mathilde Bras,Younes Belkada,T. Wolf +386 more
TL;DR: BLOOM is a decoder-only Transformer language model trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total).
Journal Article
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Shaden Smith,Md. Mostofa Ali Patwary,Brandon Norick,Patrick LeGresley,Samyam Rajbhandari,Jared Casper,Zhun Liu,Shrimai Prabhumoye,George Zerveas,Vijay Anand Korthikanti,Elton Zhang,Rewon Child,Reza Yazdani Aminabadi,Julie Bernauer,Xia Song,Mohammad Shoeybi,Yuxiong He,Mike Houston,Saurabh Tiwary,B. Catanzaro +19 more
TL;DR: The infrastructure and the 3D parallelism methodology used to train the largest monolithic transformer-based language model, Megatron-Turing NLG 530B (MT-NLG), with 530 billion parameters, are presented.
Posted Content
Deep Voice: Real-time Neural Text-to-Speech
Sercan O. Arik,Mike Chrzanowski,Adam Coates,Gregory Diamos,Andrew Gibiansky,Yongguo Kang,Xian Li,John J. Miller,Andrew Y. Ng,Jonathan Raiman,Shubho Sengupta,Mohammad Shoeybi +11 more
TL;DR: Deep Voice proposes a segmentation model for locating phoneme boundaries, a grapheme-to-phoneme conversion model, a phoneme duration prediction model, and an audio synthesis model.