Welcome to the fifth edition of ML News Monthly – Feb 2021!!
Here are the key happenings this month in the Machine Learning field that I think are worth knowing about. 🕸
1) Bollywood Movies Still Connect Beauty with Fair Skin, Reveals AI-Based Study
2) New deep learning models require fewer neurons
3) Startup says A.I. helped it find treatment for rare lung disease in record time
Insilico Medicine, a Hong Kong–based biotechnology company that uses A.I. tools to find potential new therapies, announced Tuesday that it has brought a drug candidate from an initial scientific hunch to the cusp of human clinical trials in less than 18 months, a time span the company says may be a new record for a process that often takes more than four years.
4) USA Data Science Job Market Shrinking as Data Engineering Grows Exponentially, New Study by Interview Query
5) New Contextual Calibration Method Boosts GPT-3 Accuracy Up to 30%
6) India’s 40 Under 40 Data Scientists
This award brings together the brightest leaders in the Data Science field in India and celebrates their achievements.
7) Digital Owl emerges from stealth with AI that analyzes and summarizes medical records
8) Korea adopting Israeli Technology Breakthrough for Learning English
MagniLearn, a leading Israeli ed-tech company that is transforming English learning with purely personalised technology, announced a partnership today with Korea’s “The Education Company”, a leading network of schools with over 5,000 students throughout Korea, with Kim Venturous as a local strategic partner.
9) Google BERT vs SMITH: How They Work & Work Together
Earlier, on ‘Search Engine Journal’, the author Roger Montti covered the Google research paper on a new Natural Language Processing algorithm named SMITH.
The conclusion? That SMITH outperforms BERT for long documents.
10) Transformers Scale to Long Sequences With Linear Complexity Via Nyström-Based Self-Attention Approximation
11) Snap partners with ShareChat’s Moj to roll out Camera Kit
Snap has partnered with ShareChat’s Moj app to integrate its Camera Kit into the Indian app as the American social giant looks to accelerate its growth in the world’s second largest internet market.
12) Why ML in Production is (still) Broken and Ways we Can Fix it
13) 4 PyTorch Lightning Community NLP Examples To Inspire Your Next Project!
14) 3 PyTorch Lightning Winning Community Kernels to Inspire your Next Kaggle Victory
15) Retrieval Augmented Generation with Huggingface Transformers and Ray
16) ELLIS NLP kick-off workshop
ELLIS (European Laboratory for Learning and Intelligent Systems, https://ellis.eu) is a European grassroots initiative in AI and ML with a focus on scientific excellence, innovation, and societal impact. The newly established ELLIS NLP program, led by Iryna Gurevych, André Martins, and Ivan Titov, includes NLP fellows and scholars from 15 European institutions (https://ellis.eu/programs/natural-language-processing).
17) Exploring hyperparameter meta-loss landscapes with Jax
This post walks through an example showing how extraordinarily complex meta-loss landscapes can emerge from a relatively simple setting and how, as a result, gradients of these landscapes become a lot less useful. This is done using a relatively new machine learning library: JAX.
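To get a feel for what a "meta-loss" is, here is a minimal sketch in plain Python. It is not the post's setup and it does not use JAX: the inner problem is a hypothetical 1-D quadratic, the hyperparameter is the learning rate, and the meta-gradient is estimated with finite differences instead of automatic differentiation. All function names are made up for illustration.

```python
def inner_loss(x, curvature=1.0):
    # Toy inner objective (an assumption for illustration): 0.5 * c * x^2
    return 0.5 * curvature * x * x


def meta_loss(lr, steps=20, x0=1.0, curvature=1.0):
    """Inner loss reached after `steps` gradient-descent updates with step size `lr`."""
    x = x0
    for _ in range(steps):
        grad = curvature * x      # analytic gradient of the quadratic
        x = x - lr * grad         # one unrolled optimizer step
    return inner_loss(x, curvature)


def meta_grad(lr, eps=1e-6, **kwargs):
    """Central finite-difference estimate of d(meta_loss)/d(lr)."""
    return (meta_loss(lr + eps, **kwargs) - meta_loss(lr - eps, **kwargs)) / (2 * eps)


if __name__ == "__main__":
    # Scan the learning rate and print the meta-loss and its slope.
    for lr in (0.1, 0.5, 1.0, 1.9, 2.1, 2.5):
        print(f"lr={lr:<4} meta_loss={meta_loss(lr):.3e} meta_grad={meta_grad(lr):.3e}")
```

Even in this toy case, the meta-gradient changes by many orders of magnitude once the learning rate passes the stability limit of the inner problem (2 / curvature here), which hints at why gradients with respect to hyperparameters can be hard to use once the inner training loop is unrolled.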
18) Evolving Neural Networks in JAX
19) Parallelizing neural networks on one GPU with JAX
20) NLP for India – a relentless pursuit in innovation and creativity
21) India Budget 2021: Finance Minister allocates Rs 50,000 crore for National Research Foundation
22) China’s ed tech unicorns prove that remote learning can work
23) Spotify patents tech to recommend songs based on users’ speech, emotion
The music-streaming company Spotify has been granted a patent for technology that aims to interpret users’ speech and background noise to better curate the music it serves up.
FAANG / GAFAM / FANGAM / BATX
24) Google – Introducing Model Search: An Open Source Platform for Finding Optimal ML Models
25) Facebook AI’s Multitask & Multimodal Unified Transformer: A Step Toward General-Purpose Intelligent Agents
26) Speller100: Zero-shot spelling correction at scale for 100-plus languages
Microsoft has recently launched large-scale multilingual spelling correction models worldwide, with high precision and high recall in 100-plus languages! These models, a technology they collectively call Speller100, are currently helping to improve search results for these languages in Bing.
27) Improving Mobile App Accessibility with Icon Detection
28) Azure Quantum is now in Public Preview
Azure Quantum, the world’s first full-stack, public cloud ecosystem for quantum solutions, is now open for business. Developers, researchers, systems integrators, and customers can use it to learn and build solutions based on the latest innovations—using familiar tools in the public cloud.
29) Google’s Voice AI accelerator launches 12 startups
30) Facebook’s Continual Transfer Learning Benchmark
31) Wav2Vec 2
Authors at Facebook & Hugging Face show for the first time that learning powerful representations from speech audio alone, followed by fine-tuning on transcribed speech, can outperform the best semi-supervised methods while being conceptually simpler.
32) The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
Authors introduce GEM, a living benchmark for natural language Generation (NLG), its Evaluation, and Metrics. GEM provides an environment in which models can easily be applied to a wide set of corpora and evaluation strategies can be tested. Regular updates to the benchmark will help NLG research become more multilingual and evolve the challenge alongside models.
33) ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision
In this paper, the authors present a minimal VLP model, Vision-and-Language Transformer (ViLT), monolithic in the sense that processing of visual inputs is drastically simplified to the same convolution-free manner in which textual inputs are processed. They show that ViLT is up to 60 times faster than previous VLP models, yet with competitive or better downstream task performance.
34) Aspect-Sentiment Embeddings for Company Profiling and Employee Opinion Mining
With the multitude of companies and organizations that abound today, ranking them and choosing one out of the many is a difficult and cumbersome task.
Authors aim to overcome this problem by generating aspect-sentiment based embeddings for companies from reliable employee reviews. They created a comprehensive dataset of company reviews from the well-known website Glassdoor.com and employed a novel ensemble approach to perform aspect-level sentiment analysis.
35) Speech Recognition by Simply Fine-tuning BERT
Authors propose a simple method for automatic speech recognition (ASR) by fine-tuning BERT, which is a language model (LM) trained on large-scale unlabeled text data and can generate rich contextual representations. The assumption is that given a history context sequence, a powerful LM can narrow the range of possible choices and the speech signal can be used as a simple clue.
36) Learning the language of viral evolution and escape
37) Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
Courses / Resources
38) Hugging Face on PyTorch / XLA TPUs: Faster and cheaper training
39) A Complete Machine Learning Project From Scratch: Setting Up
40) Keeping Up with PyTorch Lightning and Hydra — 2nd Edition
41) How Positional Embeddings work in Self-Attention (code in Pytorch)
42) Papers with Code – PyTorch Image Models
43) Python Outlier Detection (PyOD)
PyOD is a comprehensive and scalable Python toolkit for detecting outlying objects in multivariate data.
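For readers new to outlier detection, here is a minimal sketch of one common idea: score each point by the distance to its k-th nearest neighbour and flag the highest scores. This is plain Python written for illustration only. It is not PyOD's API, and the function names and the `contamination` handling below are assumptions of this sketch.

```python
import math


def knn_outlier_scores(points, k=3):
    """Score each point by the distance to its k-th nearest neighbour."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


def flag_outliers(points, k=3, contamination=0.1):
    """Flag roughly the top `contamination` fraction of points by score."""
    scores = knn_outlier_scores(points, k)
    n_outliers = max(1, round(contamination * len(points)))
    threshold = sorted(scores, reverse=True)[n_outliers - 1]
    return [s >= threshold for s in scores]


if __name__ == "__main__":
    # Nine points in the unit square plus one far-away point
    data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
            (0.2, 0.8), (0.8, 0.2), (0.4, 0.6), (0.6, 0.4), (10, 10)]
    print(flag_outliers(data))
```

A point that sits far from every cluster has a large k-th neighbour distance and ends up above the threshold. A library like PyOD is worth reaching for once you need more detectors, larger data, or consistent scoring across methods.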
44) CS329s Lecture 3: Data engineering (Chip Huyen Notes)
45) Multilingual and code-switching ASR challenges for low resource Indian languages
46) Pororo: A Deep Learning based Multilingual Natural Language Processing Library
47) tez: train pytorch models fasterrrrr
48) Question Generation using 🤗transformers
49) Jina AI
Jina is a deep learning-powered search framework for building cross-/multi-modal search systems (e.g. text, images, video, audio) on the cloud.
50) 6 Things in SaaS That Are Only Obvious At Scale
51) Startup Freshworks Hits $300 Million in Sales With IPO Looming
52) BudgetML: Deploy ML models on a budget
53) Python Concurrency: The Tricky Bits
54) How to use RAPIDS on Amazon SageMaker
55) Hugging Face Transformers Package – What Is It and How To Use It
That’s it!!
Let me know if I missed anything or if there’s anything you think should be included in a future post.