DL Research Intern, TotalEnergies DATALAB, France
May 2023 — September 2023
TotalEnergies is one of the largest global energy companies. I joined the DATALAB team, the Research and Development center responsible for managing the company's geoscience data. During my time there, I focused on developing a multimodal Neural Search Engine solution using Retrieval-Augmented Generation (RAG) and explored the capabilities of language models for Named Entity Recognition (NER), comparing them to the BERT model.
Responsibilities:
- Implemented and benchmarked state-of-the-art LLM fine-tuning methods such as LoRA, QLoRA, and FSDP.
- Proposed an advanced RAG pipeline with pre- and post-retrieval steps using pgvector, a sentence-transformers model, and Llama 2.
- Fine-tuned the embedding model on geoscience data to improve semantic representations.
- Optimized inference time with quantization techniques and CUDA programming.
- Implemented and compared multiple partial-parameter fine-tuning strategies for NER tasks on geoscience data.