Location: NJ / NYC (hybrid or remote preferred)
Remote: Yes
Willing to relocate: No
What I’ve been working on lately:
LLM introspection tooling and observability, mainly around token-level behavior during inference. I've also experimented with inference-time interventions (no retraining), e.g. improving DeBERTa's performance on HANS via targeted layer/head adjustments.
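To show the shape of that kind of intervention (a generic sketch only, not the DeBERTa/HANS setup; GPT-2 stands in here just to keep the example self-contained), one way to mute a single attention head at inference time is a forward pre-hook that zeroes that head's slice right before the attention output projection:

  import torch
  from transformers import GPT2LMHeadModel, GPT2Tokenizer

  tok = GPT2Tokenizer.from_pretrained("gpt2")
  model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

  LAYER, HEAD = 5, 3  # arbitrary layer/head coordinate, for illustration only
  head_dim = model.config.n_embd // model.config.n_head

  def mute_head(module, inputs):
      # inputs[0]: merged per-head attention output, (batch, seq, n_embd),
      # about to pass through c_proj; zero out this head's slice of it
      hidden = inputs[0].clone()
      hidden[..., HEAD * head_dim:(HEAD + 1) * head_dim] = 0.0
      return (hidden,) + inputs[1:]

  handle = model.transformer.h[LAYER].attn.c_proj.register_forward_pre_hook(mute_head)
  with torch.no_grad():
      out = model(**tok("The capital of France is", return_tensors="pt"))
  handle.remove()  # detach the hook so later passes run unmodified
  print(tok.decode([out.logits[0, -1].argmax().item()]))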
I just put up a small demo of part of the tooling (HF Space):
https://huggingface.co/spaces/anotheruserishere/Cartogemma
It exposes:
- per-head projections into token space (logit lens-style)
- token rank changes across layers for target tokens ("rank displacement"; a rough sketch follows this list)
- top-k next-token branches with internal state views
- mute a head at a given L x H coordinate (same hook mechanism as the sketch above)
- inject tokens or rewind context (sketched just below)
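The inject/rewind item is conceptually just editing the token-id sequence before the next forward pass; with a KV cache you'd truncate the cached keys/values instead of recomputing. A minimal sketch, again with GPT-2 as a stand-in and an arbitrary injected token:

  import torch
  from transformers import GPT2LMHeadModel, GPT2Tokenizer

  tok = GPT2Tokenizer.from_pretrained("gpt2")
  model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

  ids = tok("The quick brown fox jumps over the lazy", return_tensors="pt").input_ids
  rewound = ids[:, :-2]                                      # rewind: drop the last two tokens
  injected = tok(" sleepy", return_tensors="pt").input_ids   # inject: arbitrary replacement tokens
  new_ctx = torch.cat([rewound, injected], dim=-1)

  with torch.no_grad():
      logits = model(new_ctx).logits
  print(tok.decode([logits[0, -1].argmax().item()]))         # next-token prediction from the edited context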
There's a minimal example in the UI showing how token candidates stabilize (or don't) across layers.
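Roughly what the rank/candidate views compute, as a self-contained sketch (GPT-2 stand-in; " Paris" is an arbitrary target token; the per-head variant would project each head's contribution separately): read each layer's residual stream through the final layer norm and the unembedding, then track the target token's rank and the top-k candidates per layer.

  import torch
  from transformers import GPT2LMHeadModel, GPT2Tokenizer

  tok = GPT2Tokenizer.from_pretrained("gpt2")
  model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

  target_id = tok.encode(" Paris")[0]  # first BPE piece of the target token
  with torch.no_grad():
      out = model(**tok("The Eiffel Tower is in", return_tensors="pt"),
                  output_hidden_states=True)

  # out.hidden_states: embeddings + one tensor per layer, each (batch, seq, n_embd)
  for layer, h in enumerate(out.hidden_states):
      # logit lens: project the last position's residual stream through
      # the final layer norm and the (tied) unembedding matrix
      logits = model.lm_head(model.transformer.ln_f(h[0, -1]))
      rank = (logits > logits[target_id]).sum().item()  # 0 means top-1
      top5 = [tok.decode([i]) for i in logits.topk(5).indices.tolist()]
      print(f"layer {layer:2d}  rank(' Paris')={rank:5d}  top-5={top5}")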
Background: ~15 years in data science/analytics (higher ed), mostly translating technical work into decisions & policy for leadership. More recently focused on LLM internals + tooling (Python/Rust, local model stacks, etc.) and agentic analytical tools for operational use. Academic background in philosophy / applied linguistics & NLP; previously designed and taught a course on propaganda (how language shapes reactions to media, etc.).
Looking for: roles around LLM tooling, evals, interpretability, or applied AI where understanding model behavior is useful.
Tech: Python, Rust, SQL, embeddings, local LLM infra
Contact: [email protected]