LLM-as-Judge for Hallucination Detection: Does the Critic Agent Actually Work?
This is a follow-up to “From arXiv to SEC: Building a Multi-Agent Financial Report Analyst with LangGraph”. That post ended with: “The remaining question is...
DefectVision: Building a Real-Time Manufacturing Defect Detector Trained on Normal Images Only
From arXiv to SEC: Building a Multi-Agent Financial Report Analyst with LangGraph
This is not a technical post. It is an account of how I ended up studying Data Science and AI at the University of Liverpool after four years of Naval Arc...
This post is a deep dive into the fine-tuning experiment from the arXiv RAG System. That post summarised it in a section; this one documents every detail...
This is a follow-up to “arXiv RAG System: Engineering an Academic Paper Q&A System from Scratch”. The system was functionally complete after 7 days, but...
arXiv RAG System: Engineering an Academic Paper Q&A System from Scratch
TORCS Corkscrew Challenge: A Journey Through Reinforcement Learning Failures and Breakthroughs