ML News Monthly – Feb 2021

Welcome to the fifth edition of ML News Monthly – Feb 2021!!

Here are the key happenings this month in the Machine Learning field that I think are worth knowing about. 🕸

1) Bollywood Movies Still Connect Beauty with Fair Skin, Reveals AI-Based Study

2) New deep learning models require fewer neurons

3) Startup says A.I. helped it find treatment for rare lung disease in record time

Hong Kong–based biotechnology company, Insilico Medicine, which uses A.I. tools to help it find potential new therapies, announced Tuesday it has brought a drug candidate from an initial scientific hunch to the cusp of human clinical trials in less than 18 months, a time span the company says may be a new record for a process that often takes more than four years.

4) USA Data Science Job Market Shrinking as Data Engineering Grows Exponentially, New Study by Interview Query

5) New Contextual Calibration Method Boosts GPT-3 Accuracy Up to 30%

6) India’s 40 Under 40 Data Scientists

This award brings together the brightest leaders in the Data Science field in India and celebrates their achievements.

7) Digital Owl emerges from stealth with AI that analyzes and summarizes medical records

8) Korea adopting Israeli Technology Breakthrough for Learning English

MagniLearn, a leading Israeli ed-tech company that is transforming English learning with purely personalised technology, announced a partnership today with Korea’s “The Education Company”, a leading network of schools with over 5,000 students throughout Korea, with Kim Venturous as a local strategic partner.

9) Google BERT vs SMITH: How They Work & Work Together

Earlier, on ‘Search Engine Journal’, the author Roger Montti covered the Google research paper on a new Natural Language Processing algorithm named SMITH.

The conclusion? That SMITH outperforms BERT for long documents.

10) Transformers Scale to Long Sequences With Linear Complexity Via Nyström-Based Self-Attention Approximation

11) Snap partners with ShareChat’s Moj to roll out Camera Kit

Snap has partnered with ShareChat’s Moj app to integrate its Camera Kit into the Indian app, as the American social giant looks to accelerate its growth in the world’s second largest internet market.

12) Why ML in Production is (still) Broken and Ways we Can Fix it

13) 4 PyTorch Lightning Community NLP Examples To Inspire Your Next Project!

14) 3 PyTorch Lightning Winning Community Kernels to Inspire your Next Kaggle Victory

15) Retrieval Augmented Generation with Huggingface Transformers and Ray

16) ELLIS NLP kick-off workshop

ELLIS (European Laboratory for Learning and Intelligent Systems) is a European grassroots initiative in AI and ML with a focus on scientific excellence, innovation, and societal impact. The newly established ELLIS NLP program, which is led by Iryna Gurevych, André Martins, and Ivan Titov, includes NLP fellows and scholars from 15 European institutions.

17) Exploring hyperparameter meta-loss landscapes with Jax

This post walks through an example showing how extraordinarily complex meta-loss landscapes can emerge from a relatively simple setting, and how, as a result, gradients of these loss landscapes become a lot less useful. This is done using a relatively new machine learning library: Jax.
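To make "meta-loss" concrete, here is a minimal sketch of my own, not the post's code. It treats the learning rate as the hyperparameter and defines the meta-loss as the inner loss reached after a fixed number of gradient-descent steps. The quadratic inner objective, the function names and the finite-difference meta-gradient are all assumptions made for illustration. The post itself uses Jax and automatic differentiation; plain Python is used here only to keep the sketch dependency-free.

```python
def inner_loss(x, curvature=1.0):
    """Toy quadratic inner objective: 0.5 * curvature * x**2 (an assumption)."""
    return 0.5 * curvature * x * x


def inner_grad(x, curvature=1.0):
    """Analytic gradient of the toy inner objective."""
    return curvature * x


def meta_loss(learning_rate, steps=20, x0=1.0, curvature=1.0):
    """Inner loss reached after `steps` gradient-descent updates."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * inner_grad(x, curvature)
    return inner_loss(x, curvature)


def meta_grad(learning_rate, eps=1e-5, **kwargs):
    """Central finite-difference estimate of d(meta_loss)/d(learning_rate)."""
    hi = meta_loss(learning_rate + eps, **kwargs)
    lo = meta_loss(learning_rate - eps, **kwargs)
    return (hi - lo) / (2 * eps)


if __name__ == "__main__":
    # Sweep the hyperparameter and print the meta-loss and its gradient.
    for lr in (0.5, 1.0, 1.9, 2.1, 2.5):
        print(f"lr={lr:<4} meta_loss={meta_loss(lr):.3e} meta_grad={meta_grad(lr):.3e}")
```

Even in this deliberately simple case, the meta-gradient differs by many orders of magnitude between a stable learning rate and a slightly unstable one, which hints at why such gradients can be hard to use. The landscapes explored in the post are considerably richer than this toy.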

18) Evolving Neural Networks in JAX

19) Parallelizing neural networks on one GPU with JAX

20) NLP for India – a relentless pursuit in innovation and creativity

21) India Budget 2021 : Finance Minister allocates Rs 50,000 crore for National Research Foundation

22) China’s ed tech unicorns prove that remote learning can work

23) Spotify patents tech to recommend songs based on users’ speech, emotion

The music-streaming company Spotify has been granted a patent for technology that aims to interpret users’ speech and background noise to better curate the music it serves up.


24) Google – Introducing Model Search: An Open Source Platform for Finding Optimal ML Models

25) Facebook AI’s Multitask & Multimodal Unified Transformer: A Step Toward General-Purpose Intelligent Agents

26) Speller100: Zero-shot spelling correction at scale for 100-plus languages

Microsoft has recently launched large-scale multilingual spelling correction models worldwide, with high precision and high recall in 100-plus languages. These models, a technology they collectively call Speller100, are currently helping to improve search results for these languages in Bing.

27) Improving Mobile App Accessibility with Icon Detection

28) Azure Quantum is now in Public Preview

Azure Quantum, the world’s first full-stack, public cloud ecosystem for quantum solutions, is now open for business. Developers, researchers, systems integrators, and customers can use it to learn and build solutions based on the latest innovations—using familiar tools in the public cloud.

29) Google’s Voice AI accelerator launches 12 startups

30) Facebook’s Continual Transfer Learning Benchmark


31) Wav2Vec 2

Authors at Facebook & Hugging Face show for the first time that learning powerful representations from speech audio alone, followed by fine-tuning on transcribed speech, can outperform the best semi-supervised methods while being conceptually simpler.

32) The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics

The authors introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. GEM provides an environment in which models can easily be applied to a wide set of corpora and evaluation strategies can be tested. Regular updates to the benchmark will help NLG research become more multilingual and evolve the challenge alongside models.

33) ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision

In this paper, the authors present a minimal vision-and-language pre-training (VLP) model, Vision-and-Language Transformer (ViLT), which is monolithic in the sense that the processing of visual inputs is drastically simplified to the same convolution-free manner in which textual inputs are processed. They show that ViLT is up to 60 times faster than previous VLP models, with competitive or better downstream task performance.

34) Aspect-Sentiment Embeddings for Company Profiling and Employee Opinion Mining

With the multitude of companies and organizations that abound today, ranking them and choosing one out of the many is a difficult and cumbersome task.

The authors aim to overcome this problem by generating aspect-sentiment based embeddings for companies from reliable employee reviews. They created a comprehensive dataset of company reviews from a well-known website and employed a novel ensemble approach to perform aspect-level sentiment analysis.

35) Speech Recognition by Simply Fine-tuning BERT

The authors propose a simple method for automatic speech recognition (ASR) by fine-tuning BERT, a language model (LM) trained on large-scale unlabeled text data that can generate rich contextual representations. The assumption is that, given a history context sequence, a powerful LM can narrow the range of possible choices, and the speech signal can be used as a simple clue.

36) Learning the language of viral evolution and escape

37) Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models

Courses / Resources

38) Hugging Face on PyTorch / XLA TPUs: Faster and cheaper training

39) A Complete Machine Learning Project From Scratch: Setting Up

40) Keeping Up with PyTorch Lightning and Hydra — 2nd Edition

41) How Positional Embeddings work in Self-Attention (code in Pytorch)

42) Papers with Code – PyTorch Image Models

43) Python Outlier Detection (PyOD)

PyOD is a comprehensive and scalable Python toolkit for detecting outlying objects in multivariate data.

44) CS329s Lecture 3: Data engineering (Chip Huyen Notes)

45) Multilingual and code-switching ASR challenges for low resource Indian languages

46) Pororo: A Deep Learning based Multilingual Natural Language Processing Library

47) tez: train pytorch models fasterrrrr

48) Question Generation using 🤗transformers

49) Jina AI

Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g. text, images, video, audio) on the cloud.

50) 6 Things in SaaS That Are Only Obvious At Scale

51) Startup Freshworks Hits $300 Million in Sales With IPO Looming

52) BudgetML: Deploy ML models on a budget

53) Python Concurrency: The Tricky Bits

54) How to use RAPIDS on Amazon SageMaker

55) Hugging Face Transformers Package – What Is It and How To Use It

That’s it!

Let me know if I missed anything or if there’s anything you think should be included in a future post.