Category NLP

413. Tips For Developing Vector Databases

▮ Using Vector Stores The combination of vector databases and LLMs, as in retrieval-augmented generation, has had a massive impact on the AI industry. When adopting these technologies, how you develop and maintain your own vector databases becomes critically important.…
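At its core, the vector-database retrieval this post refers to is nearest-neighbor search over embeddings. A minimal sketch, with made-up document names and toy 4-dimensional vectors standing in for real embedding-model output:

```python
import math

# Toy "vector store": each document is keyed to an embedding vector.
# These 4-dimensional vectors are invented for illustration; a real
# store holds hundreds of dimensions produced by an embedding model.
store = {
    "doc_a": [0.1, 0.9, 0.0, 0.2],
    "doc_b": [0.8, 0.1, 0.3, 0.0],
    "doc_c": [0.2, 0.8, 0.1, 0.1],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def retrieve(query, k=2):
    """Return the k stored documents most similar to the query vector."""
    ranked = sorted(store, key=lambda name: cosine(query, store[name]),
                    reverse=True)
    return ranked[:k]

print(retrieve([0.15, 0.85, 0.05, 0.15]))  # ['doc_a', 'doc_c']
```

In a RAG setup, the retrieved documents would then be pasted into the LLM prompt as context; production systems replace the linear scan with an approximate-nearest-neighbor index.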

411. Procedures For Text Generation Projects

▮ Framework I am currently involved in several text-generation model development projects for non-creative tasks. So for this post, I’d like to share what I’ve learned about how to carry out text-generation projects. If you can specifically define all the elements in…

410. LLM Reasoning

▮ LLM Reasoning Despite the impressive performance of LLMs across many tasks, their reasoning processes can still inadvertently introduce hallucinations and accumulated errors. For this post, I’d like to share what I’ve learned from several state-of-the-art research papers in this field,…

409. Multi-Stage-Reasoning Using LLMs

▮ LLM Tasks VS LLM-Based Workflows LLMs are great at single tasks, but when we want to use LLMs in real-world applications, there is rarely only a single task involved. Typical applications are more than just a…

408. What LLMs Suck At

▮ LLMs Generative AI has been trending for quite a while, and I’ve been curious about the actual credibility of these models. The outputs these models create are so persuasive that it seems too good to be true. So for…

146. BERT

What is BERT? BERT is a deep learning architecture for natural language processing. If you stack the Transformer’s encoder, you get BERT. What can BERT solve? Neural Machine Translation, Question Answering, Sentiment Analysis, Text Summarization. How to solve the problems…
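The "stack the Transformer's encoder" idea can be sketched without any deep-learning library: the output of one encoder layer feeds the next, repeated N times (12 layers in BERT-base). The layer below is a tagging stand-in, not a real attention block:

```python
# Toy illustration of stacking encoder layers. A real layer applies
# self-attention plus a feed-forward network; here we just tag each
# token so the layer-to-layer data flow is visible.

def encoder_layer(hidden_states, layer_id):
    return [f"{token}->L{layer_id}" for token in hidden_states]

def bert_encoder(tokens, num_layers=12):
    """Feed token representations through num_layers stacked encoders."""
    hidden = tokens
    for i in range(1, num_layers + 1):
        hidden = encoder_layer(hidden, i)
    return hidden

out = bert_encoder(["[CLS]", "hello", "[SEP]"], num_layers=2)
print(out)  # ['[CLS]->L1->L2', 'hello->L1->L2', '[SEP]->L1->L2']
```

The `[CLS]` and `[SEP]` strings mirror BERT's real special tokens; everything else here is a simplification of the per-layer computation.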

80. Shift in Recommendation Filtering

Over the last several years, there has been a trend in recommendation systems shifting from COLLABORATIVE FILTERING to CONTENT-BASED FILTERING. Each workflow is as follows. Let’s say we want to recommend a restaurant to a user. Collaborative Filtering 1. looks…
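The restaurant example in the teaser can be sketched in a few lines for both approaches. All restaurant names, tags, and ratings below are invented for illustration:

```python
# Content-based filtering: match item attributes to the user's tastes.
restaurants = {
    "Sushi Go":   {"japanese", "seafood"},
    "Pasta Casa": {"italian", "pasta"},
    "Ramen Ya":   {"japanese", "noodles"},
}

def content_based(user_tags, k=1):
    """Recommend restaurants whose tags overlap most with the user's tastes."""
    ranked = sorted(restaurants,
                    key=lambda name: len(restaurants[name] & user_tags),
                    reverse=True)
    return ranked[:k]

# Collaborative filtering: match the user to similar users' histories.
ratings = {  # user -> {restaurant: rating}
    "alice": {"Sushi Go": 5, "Ramen Ya": 4},
    "bob":   {"Sushi Go": 5, "Pasta Casa": 2},
}

def collaborative(target, k=1):
    """Recommend what the most similar user liked that target hasn't tried."""
    others = [u for u in ratings if u != target]
    # Similarity here = number of co-rated restaurants (deliberately crude).
    best = max(others, key=lambda u: len(ratings[u].keys()
                                         & ratings[target].keys()))
    unseen = {r: s for r, s in ratings[best].items()
              if r not in ratings[target]}
    return sorted(unseen, key=unseen.get, reverse=True)[:k]

print(content_based({"japanese", "noodles"}))  # ['Ramen Ya']
print(collaborative("bob"))                    # ['Ramen Ya']
```

Note the structural difference the post's shift is about: content-based filtering needs only item attributes and one user's profile, while collaborative filtering needs overlapping rating histories across many users.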

58. Word Embedding

Let’s say we have 300 genres; each column of the table would then be a 300×1 vector representing the nuance of that word. Vectorizing this nuance is called word embedding. By making the machine learn this nuance (the table above),…
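The genre-table idea can be made concrete with a toy embedding table. The real table would use 300 learned dimensions; three invented ones suffice to show why similar words end up with similar vectors:

```python
# Toy word-embedding table: each word is a vector whose entries score
# it against a handful of "genre"-like attributes. Dimensions and
# values are made up for illustration.

embeddings = {              # dims: [royalty, male, female]
    "king":  [0.95, 0.90, 0.05],
    "queen": [0.95, 0.05, 0.92],
    "man":   [0.05, 0.95, 0.03],
}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def most_similar(word):
    """Find the nearest other word by raw dot product (toy similarity)."""
    others = [w for w in embeddings if w != word]
    return max(others, key=lambda w: dot(embeddings[word], embeddings[w]))

print(most_similar("king"))  # queen
```

In practice these vectors are not hand-filled but learned from text (as in word2vec or GloVe), which is what "making the machine learn this nuance" refers to.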