ArkhamMirror: Airgapped investigation platform with CIA-style hypothesis testing
Posted by ArkhamMirror 20 hours ago
Comments
Comment by ArkhamMirror 20 hours ago
What makes this different:
Air-gapped: Zero cloud dependencies. Uses local LLMs via LM Studio (Qwen, etc.)
ACH Methodology: Implements the CIA's "Analysis of Competing Hypotheses" technique, which forces you to look for evidence that disproves your theories instead of confirming them
Corpus Integration: Import evidence directly from your documents with source links
Sensitivity Analysis: Shows which evidence is critical, i.e. whether your conclusion would change if that piece turned out to be wrong (a rough sketch of the idea follows below)
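To make the ACH scoring and sensitivity ideas above concrete, here's a toy sketch in plain Python (the scoring scheme is simplified and the evidence/hypothesis names are made up for illustration; this is not ArkhamMirror's actual code):

    # Toy ACH-style matrix: score each piece of evidence against each hypothesis,
    # then flag evidence whose removal would flip the leading hypothesis.
    HYPOTHESES = ["H1", "H2", "H3"]

    # evidence -> scores per hypothesis: -2 strongly inconsistent .. +2 strongly consistent
    MATRIX = {
        "E1: financial records":  [+1, -1, -1],
        "E2: witness statement":  [+1, -2,  0],
        "E3: timeline of events": [-2, +1, +1],
    }

    def rank(matrix):
        # Simplified: rank by total score (real ACH focuses on the hypothesis
        # with the least disconfirming evidence rather than the most support).
        totals = [sum(row[i] for row in matrix.values()) for i in range(len(HYPOTHESES))]
        return max(range(len(HYPOTHESES)), key=lambda i: totals[i])

    baseline = rank(MATRIX)

    # Sensitivity check: drop each evidence item and see if the conclusion changes.
    for item in MATRIX:
        reduced = {k: v for k, v in MATRIX.items() if k != item}
        if rank(reduced) != baseline:
            print(f"critical evidence: {item!r} - the leading hypothesis depends on it")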
The ACH feature just dropped with an 8-step guided workflow, AI assistance at every stage, and PDF/Markdown/JSON export with AI disclosure flags. It's better than what any given three-letter agency uses.
Tech stack: Python/Reflex (React frontend), PostgreSQL, Qdrant (vectors), Redis (job queue), PaddleOCR, spaCy NER, BGE-M3 embeddings.
All MIT licensed. Happy to answer questions about the methodology or implementation! Intelligence for anyone.
Links: Repo https://github.com/mantisfury/ArkhamMirror
ACH guide with screenshots at https://github.com/mantisfury/ArkhamMirror/blob/reflex-dev/d...
Comment by daft_pink 15 hours ago
It looks cool.
Comment by ArkhamMirror 14 hours ago
Short answer - no, not right now.
However, instead of going through locally hosted Docker and local LLMs, you could reroute it wherever you like; I just don't have a cloud option set up at this time.
I'm focused on developing the local, private application myself, but nothing is stopping someone from hooking it up to stronger cloud-based stuff if they want.
The good news is that my plans for this include making it more modular, so people have better options for what it does and how powerful it is.
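For anyone wondering what rerouting could look like in practice, here's a rough sketch (assumptions: an OpenAI-compatible client and LM Studio's default local endpoint; the model names are placeholders, and this is not the project's actual config code). Since LM Studio speaks the OpenAI API, switching between local and cloud is mostly a base-URL and API-key swap:

    # Hypothetical sketch: the same OpenAI-compatible client can talk to a local
    # LM Studio server or a cloud provider just by changing base_url and api_key.
    import os
    from openai import OpenAI  # pip install openai

    USE_CLOUD = os.getenv("USE_CLOUD", "0") == "1"

    client = OpenAI(
        base_url="https://api.openai.com/v1" if USE_CLOUD else "http://localhost:1234/v1",
        api_key=os.getenv("OPENAI_API_KEY") if USE_CLOUD else "lm-studio",  # local server ignores the key
    )

    resp = client.chat.completions.create(
        model="gpt-4o-mini" if USE_CLOUD else "qwen2.5-14b-instruct",  # whatever model is loaded locally
        messages=[{"role": "user", "content": "Summarize the key inconsistencies in this evidence set."}],
    )
    print(resp.choices[0].message.content)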
Comment by V__ 18 hours ago
Comment by ArkhamMirror 17 hours ago
Comment by btown 9 hours ago
What are the competing hypotheses, other than fraud, when a person makes a massive luxury purchase, but with red-flag-adjacent inconsistencies in other information provided? If we need to identify whether there's a weird or competitive ownership relationship behind a potential opportunity, how do we determine if an initial hypothesis about relationships is correct?
If ArkhamMirror has an online mode with web search as a tool call, I'd be curious to try it out to automate some of these ACH-adjacent workflows.
Comment by cess11 17 hours ago
Commonly there is a lot of information and it might as well be unstructured, and then I need to get answers quickly because my clients aren't going to pay me for going about it slowly.
Comment by ArkhamMirror 17 hours ago
Comment by gslepak 11 hours ago
Notice the "Knowledge Graph" feature that lets you "Visualize hidden connections between People, Orgs, and Places" just like the cork board meme.
This is the essence of what good "conspiracy theorists" do. Whenever investigative journalists uncover a conspiracy among the elite, they are talked down to and dismissed as "conspiracy theorists". But that is what good conspiracy theorists are: investigative journalists.
Comment by ArkhamMirror 11 hours ago
If I had the skills, I would totally map that onto a cork board.
Comment by Theofrastus 18 hours ago
This is super interesting. I will probably (hopefully?) never need to use it, but interesting nonetheless. It also makes sense to have this type of application airgapped. Journalists need to have near-perfect OPSEC depending on what they are working on.
Comment by ArkhamMirror 18 hours ago
Comment by jerlendds 12 hours ago
I'm loving the approach you took to the UI! I had some similar ideas in mind and plan to build narrative reconstruction and timeline view tools too, so it's really nice to see how others have done it! I'll definitely be following your work, and I shared your project in the OSINTBuddy discord to hopefully get some more eyes on it :)
Great work, I hope you keep at it :)
Comment by ArkhamMirror 12 hours ago
My approach to security so far has been to keep it air-gapped and include a nukeitfromorbit.bat that will do everything but physically destroy your SSD to keep your privacy intact.
The narrative reconstruction tool was pretty fun to make, and it's been impressive in testing, but the real test will be if it actually helps someone in a real investigation.
If you see anything in my project that could help your project, then that's awesome news to me!
I'm definitely going to keep working, and hopefully soon it's going to do some pretty cool stuff. All the best to you and OSINTBuddy
Comment by sloped 15 hours ago
Comment by ArkhamMirror 15 hours ago
Comment by nilamo 16 hours ago
Comment by ArkhamMirror 16 hours ago
Comment by ckbkr10 18 hours ago
I do think though that this approach will become annoying quick:
https://github.com/mantisfury/ArkhamMirror/blob/main/scripts...
Comment by ArkhamMirror 18 hours ago
Comment by ChrisbyMe 13 hours ago
Comment by ArkhamMirror 13 hours ago
I don't have any background as an analyst or anything like that. ACH is a real tool, really used by the CIA, and the existing versions are basically crappy spreadsheets, or not free, or both.
I don't doubt someone with coding skills could do it better, it's just that no one else has stepped up. Probably because there's no profit angle, but that's conjecture on my part.
Comment by ArkhamMirror 17 hours ago
Comment by ajcp 12 hours ago
I really would like to know how good this would be for a corporate Internal Audit workflow/professional.
Comment by ArkhamMirror 12 hours ago
Is there any particular function you had in mind?
ArkhamMirror can also scan your corpus for near-duplicates and clusters, check for signs of copy-paste in people's work, find designated red flags, extract regex-matched data, and that sort of thing. It's really generalized for as many use cases as possible at this stage, and I'm about to start working on modularity for specialization soon, so feel free to make suggestions on how you'd want to use it.
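To give a feel for the kind of checks I mean, here's a trivial standard-library sketch of near-duplicate/copy-paste detection and regex extraction (illustrative only, not how ArkhamMirror implements them):

    # Flag near-duplicate / copy-pasted passages and pull out regex-matched
    # identifiers from a small corpus. Sample documents are made up.
    import re
    from difflib import SequenceMatcher
    from itertools import combinations

    docs = {
        "memo_a.txt": "Payment of $12,500 was wired to ACME Holdings on 2023-04-02.",
        "memo_b.txt": "Payment of $12,500 was wired to ACME Holdings on 2023-04-02, per the ledger.",
        "memo_c.txt": "The quarterly audit found no exceptions.",
    }

    # Near-duplicate / copy-paste check: pairwise similarity above a threshold.
    for (a, ta), (b, tb) in combinations(docs.items(), 2):
        ratio = SequenceMatcher(None, ta, tb).ratio()
        if ratio > 0.8:
            print(f"possible copy-paste: {a} vs {b} ({ratio:.0%} similar)")

    # Regex extraction: pull dates and dollar amounts into a sortable list.
    patterns = {"date": r"\d{4}-\d{2}-\d{2}", "amount": r"\$\d[\d,]*(?:\.\d{2})?"}
    for name, text in docs.items():
        for label, pat in patterns.items():
            for match in re.findall(pat, text):
                print(name, label, match)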
Comment by VerifiedReports 12 hours ago
Comment by ArkhamMirror 12 hours ago
There's an isolated venv/ in the project folder, so no global packages or system python mods.
If your python is 3.11+, the install should recognize it. If you have 3.10 or lower, it's going to prompt you to install 3.11 for the project environment through winget or python.org. If you are running multiple pythons, it uses py -3.11 to pick the version.
For Docker, the app is going to want you to already have Docker running, and will create and use 3 containers (PostgreSQL, Qdrant, Redis) in their own isolated docker-compose project. It uses nonstandard ports, but there could be conflicts if you have stuff running on 5435, 6343/6344, or 6380. The backend wants to run at 8000 and the frontend at 3000, so those could potentially conflict as well.
The script is going to check if docker is running - if it is, you should be set. If it's not, it's going to prompt you to start it up.
Nothing in the install should touch your docker daemon config or your existing containers.
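If it helps, here's a rough sketch of the shape of those preflight checks (a hypothetical illustration, not the actual install script):

    # Verify Python >= 3.11, Docker running, and port availability before install.
    import shutil, socket, subprocess, sys

    def python_ok():
        return sys.version_info >= (3, 11)

    def docker_running():
        if shutil.which("docker") is None:
            return False
        return subprocess.run(["docker", "info"], capture_output=True).returncode == 0

    def port_free(port, host="127.0.0.1"):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            return s.connect_ex((host, port)) != 0  # nonzero means nothing is listening

    if not python_ok():
        sys.exit("Python 3.11+ required (the installer would prompt winget/python.org here)")
    if not docker_running():
        sys.exit("Docker isn't running; start it and retry")
    for port in (5435, 6343, 6344, 6380, 8000, 3000):  # Postgres, Qdrant, Redis, backend, frontend
        if not port_free(port):
            print(f"warning: port {port} is already in use")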
Let me know how it works for you!
Comment by VerifiedReports 11 hours ago
I do development on my machine, so I like to control its environment deliberately.
Comment by ArkhamMirror 11 hours ago
I get it - pretty much everything I've been working with to build this platform is basically brand new to me, or just brand new in general, so I have to be wary of how I do things too.
Comment by smallerfish 16 hours ago
Comment by ArkhamMirror 16 hours ago
Comment by afro88 14 hours ago
That's a ton of scope for hallucinations, surely?
Comment by ArkhamMirror 14 hours ago
If you use a smaller model with smaller context, it might be more prone to hallucinations and provide less nuanced suggestions, but the default model seems to be able to handle the jobs pretty well without having to regenerate output very often (it does happen sometimes, but it just means you have to run it again.) Also, depending on the model, you might get less variety or creativity in suggestions. It's definitely not perfect, and it definitely shouldn't be trusted to replace human judgement.
Comment by darkwater 16 hours ago
Comment by ArkhamMirror 16 hours ago
Comment by stocksinsmocks 13 hours ago
Comment by ArkhamMirror 13 hours ago
Comment by zero0529 10 hours ago
Comment by ArkhamMirror 5 hours ago
Comment by Garlef 17 hours ago
Comment by ArkhamMirror 17 hours ago
Comment by gosub100 12 hours ago
Comment by ArkhamMirror 11 hours ago
Also, it's true that a lot of the existing tools that do similar things are anything but free.
I can imagine most or all of the things ArkhamMirror does are done elsewhere by other programs and tools. I don't know of any unclassified projects that do ACH better, but that's a pretty niche tool, and the government loves their 20-year-old software solutions.
Off-the-shelf programs designed for use by lawyers have layers of protections built in to make sure they are suitable for use in court. I don't make any claims as to the legal utility of this program whatsoever. In fact, the ACH PDF report specifically calls attention to the AI-generated nature of the materials and warns against using any data generated or entered without human review and approval.
That said, you can make some pretty cool (if not legally useful) connections with tools like author unmask, where you feed the system documents by a known author and run them against documents written by an unknown or suspected alias to check for a similar voice. During ingestion, the system also automatically pulls all regex-matched data into a nice sortable, searchable list for you.
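As a toy illustration of what "checking for similar voice" can mean mechanically (character-trigram profiles plus cosine similarity; this is a stand-in for illustration, not ArkhamMirror's actual method):

    # Compare writing-style profiles of a known author vs. a suspected alias.
    from collections import Counter
    from math import sqrt

    def profile(text, n=3):
        text = " ".join(text.lower().split())
        return Counter(text[i:i + n] for i in range(len(text) - n + 1))

    def cosine(a, b):
        dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
        return dot / (sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values())))

    known = profile("Text written by the known author goes here ...")
    suspect = profile("Text written under the suspected alias goes here ...")
    print(f"voice similarity: {cosine(known, suspect):.2f}")  # closer to 1.0 = more similar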
Legal e-discovery products are going to be highly polished, reliable programs designed to be used in a legal setting, while ArkhamMirror is designed to be used while you sit in your faraday cage in your hacker cabin in the woods with no Wi-Fi.
No shade intended - my stuff's not nearly as pretty or as well-put together as a decent off-the-shelf e-discovery program and I'm not trying to imply that it's better in any way, it's just differently aligned.
Comment by 0xdeadbeefbabe 13 hours ago
Comment by ArkhamMirror 11 hours ago
The hypothesis-generation call constrains the LLM to return a certain number of hypotheses in a certain format, that sort of thing. So it's usually in your best interest to use the LLM as an assistant, checking whether you missed anything or giving you a push to look in different directions, rather than having the AI do the whole thing. (Although if you're being lazy or don't know where to start, you could let the LLM do pretty much everything; I let it handle everything it could in testing.)
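For the curious, the kind of constrained call I'm describing looks roughly like this (a sketch assuming LM Studio's OpenAI-compatible endpoint; the model name and prompt are placeholders, not the project's actual code):

    # Ask the local model for exactly N hypotheses as JSON, then validate before use.
    import json
    from openai import OpenAI  # pip install openai

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")  # LM Studio default

    N = 5
    resp = client.chat.completions.create(
        model="qwen2.5-14b-instruct",  # placeholder; whatever model is loaded locally
        messages=[
            {"role": "system", "content": f"Return ONLY a JSON list of exactly {N} short, mutually exclusive hypotheses."},
            {"role": "user", "content": "Evidence summary: <paste your evidence here>"},
        ],
        temperature=0.7,
    )

    try:
        hypotheses = json.loads(resp.choices[0].message.content)
        assert isinstance(hypotheses, list) and len(hypotheses) == N
    except (json.JSONDecodeError, AssertionError):
        hypotheses = None  # malformed output: regenerate, as mentioned above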
Comment by jrflowers 18 hours ago
Comment by doodlebugging 15 hours ago
Comment by snapcaster 17 hours ago
Personally, as an American, I'm quite optimistic about peak stupid being ahead of us :)
Comment by CrazyStat 15 hours ago
[1] Strictly speaking it would be 1/e of your stupidity sightings, which may not be 1/e of your life. If you intend to retire early and become a hermit you may want to stop the exploration phase earlier.
Comment by Y_Y 16 hours ago
Comment by chamomeal 18 hours ago