What is BERT?
BERT (Bidirectional Encoder Representations from Transformers) is a deep learning architecture for natural language processing. If you stack the Transformer's encoder layers, you get BERT.
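To make the "stack of encoders" idea concrete, here is a minimal sketch that inspects a pre-trained model, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint (both assumptions for illustration):

```python
from transformers import BertModel

# Load a pre-trained BERT model (assumes `transformers` and `torch` are installed).
model = BertModel.from_pretrained("bert-base-uncased")

# BERT-Base is a stack of 12 Transformer encoder layers.
print(model.config.num_hidden_layers)  # 12
print(model.encoder.layer[0])          # one encoder block: self-attention + feed-forward
```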
What can BERT solve?
- Neural Machine Translation
- Question Answering (see the sketch after this list)
- Sentiment Analysis
- Text Summarization
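As a taste of one of these tasks, here is a minimal question-answering sketch, assuming the Hugging Face transformers library and a publicly available BERT checkpoint fine-tuned on SQuAD (both are assumptions, not part of the original post):

```python
from transformers import pipeline

# Question answering with a BERT model fine-tuned on SQuAD (assumed checkpoint).
qa = pipeline(
    "question-answering",
    model="bert-large-uncased-whole-word-masking-finetuned-squad",
)
result = qa(
    question="What do you get if you stack the Transformer's encoder?",
    context="If you stack the Transformer's encoder layers, you get BERT.",
)
print(result["answer"])  # e.g. "BERT"
```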
How to solve the problems above?
There are two main phases.
1. Pre-training phase
In this phase, the model is trained simultaneously on two tasks: MLM (Masked Language Modeling: predict words that have been masked out) and NSP (Next Sentence Prediction: predict whether the second of two sentences actually follows the first in context). Together these teach the model general language understanding.
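To see both pre-training objectives in action, here is a minimal sketch, assuming the Hugging Face transformers library and the bert-base-uncased checkpoint:

```python
import torch
from transformers import BertTokenizer, BertForNextSentencePrediction, pipeline

# MLM: the pre-trained model predicts the masked word from context.
fill = pipeline("fill-mask", model="bert-base-uncased")
print(fill("The capital of France is [MASK].")[0]["token_str"])  # likely "paris"

# NSP: score whether sentence B plausibly follows sentence A.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
nsp_model = BertForNextSentencePrediction.from_pretrained("bert-base-uncased")
inputs = tokenizer("I went to the store.", "I bought some milk.", return_tensors="pt")
probs = torch.softmax(nsp_model(**inputs).logits, dim=-1)
print(probs)  # index 0 = P(B follows A), index 1 = P(B is random)
```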
2. Fine-tuning phase
After pre-training, the model has a deep understanding of the language (what English looks like, how context shapes meaning), so you only need to fine-tune it for your specific downstream task.
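For example, fine-tuning BERT for sentiment analysis amounts to adding a small classification head on top of the pre-trained encoder and training on labeled examples. Below is a minimal single-step sketch, assuming the Hugging Face transformers library; a real setup would loop over a proper labeled dataset:

```python
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
# Adds a fresh, randomly initialized classification head on the pre-trained encoder.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One illustrative training step on a toy labeled example (1 = positive sentiment).
batch = tokenizer("This movie was great!", return_tensors="pt")
loss = model(**batch, labels=torch.tensor([1])).loss
loss.backward()
optimizer.step()
```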