How to use this supplement

You have my resume, LinkedIn site, questionnaire answers, and more. Why this wall of text?

You allow candidates to use AI to polish their work, and you've disclosed that you use it to help assess candidates. I thought a custom website for you might be a good demonstration. It also gives me a place to be a little more thorough.

I don't expect a human to read the whole site, but I assume Claude will. Giving more context about myself seems like it can only help. If a human reviewer wants to learn more about me in relation to the job description, feel free to skim and read what interests you. (The "What's special about this site?" topic below is a practical demonstration of how I use AI in a project setting.)

Each heading in this column on the left relates to a section of your job description. I've written topics as summaries of clusters of your job description bullets, followed by comments on how my experience and skills relate to each summarized cluster.

What's special about this site?

I used Claude Opus 4.5 from GitHub Agents and a fresh repository to generate this page. I described the approximate layout I wanted, the purpose of the site, and the coding standards I wanted to enforce. I specifically called out modern W3C standards, and most especially the latest versions of two particular standards.

Then I opened the generated HTML and CSS to tweak the styling and write my own content. I registered a domain, created a Cloudflare account, configured the DNS to use a subdomain specific to the job link, and finally deployed the site. Total labor, despite being a bit rusty as a web developer: ~9.5 hours.

  • 20 minutes looking up modern web standards and writing a prompt.
  • 45 minutes reviewing the code and tweaking the style.
  • 40 minutes exploring domain registrar and deployment options via Google and an AI chat.
  • 45 minutes playing with DNS and site settings.
  • 3 hours writing and refining content.
  • 10 minutes adding a table of contents via Claude and a GitHub Agent task.
  • 2 hours troubleshooting a table of contents bug (detailed below).
  • 2 hours of revisions and AI feedback.

Table of contents bug: an illustrative example

I discovered an issue with a feature that used "sticky" headers. The feature pinned each header to the top of the viewport until the next header scrolled past it, but the anchor links in the table of contents sometimes failed to navigate to the correct header.

  1. I wrote a detailed bug report and submitted it to the GitHub repository as an issue.
  2. I assigned the issue to Copilot (using Claude).
  3. Claude attempted to resolve the issue using JavaScript.
  4. Cloudflare detected the development branches and provided preview URLs for testing.
  5. I tested the fix, determined it wasn't working, and used the pull request to give Claude feedback.
  6. We went through three or four iterations with various approaches.
  7. I found that I regularly needed to remind Claude that I wanted to avoid JavaScript and look for out-of-the-box HTML/CSS solutions.
    • Important Retrospective: I should have created a `copilot-instructions.md` file at project inception to keep my coding standards in context. I normally create coding-standards documents at the start of any project, but opted to skip it here since this was a small demo site and I had included the standards in my initial prompt. This taught me that even on small projects, persistent instruction files produce better results than relying on initial context alone; the upfront investment is almost always worth it.
    • I created the instructions file after finishing the iterative work on the pull request, and it will be important to keep updating it whenever I find the AI moving in a direction I dislike.
  8. Finally, I asked Claude to do deeper research into how others had fixed this issue and whether a clean HTML/CSS solution existed. After consideration, Claude concluded that sticky headers in a scrolling feature weren't compatible with our use of table-of-contents anchors, and we deprecated the feature on the page. If the feature were important to me, I'd do more research to challenge that finding, possibly in a new context window or on my own; for now, the feature isn't worth the time or the risk of creating problems in this minimum-viable-product website.
    • Important Retrospective: Claude repeatedly told me that the issue was fixed. I'm not sure how much of that disconnect is promptable. Claude appeared to use the Playwright plugin to click links, observe the relevant behavior, and take screenshots, so understanding the failure might need to be addressed at the model level. It would be good to find a way to help AI arrive at an "impossible task" conclusion earlier without defaulting to an early "give up" tactic.

Comments about the job responsibilities

Writing system prompts, meta-prompts, and reviewing prompt changes and challenges

I am excited to do this as my main job, rather than as an auxiliary to my work and hobbies. I love prompting, meta-prompting, and deconstructing previous efforts to get better experiences from my AI teammates.

I enjoy employing new iterations of ideas, from roleplay to thinking systems. For example, reading an article about using hermeneutics as a prompt feature led to days of experimentation, exploring combinations with other thinking systems. Meta-prompting (prompts for creating prompts) is becoming an essential part of my toolbox. I find that antipatterns are often an important tool: showing the AI what not to do is sometimes as effective as, or more effective than, telling it what to do. For antipatterns in particular, meta-prompting is an excellent way to identify and draft antipattern sections.
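As a concrete sketch, a meta-prompt of mine might emit a template like the one below. Everything here is hypothetical illustration (the task, patterns, and antipattern text are invented); the point is the structure of pairing do's with explicit don'ts:

```python
# Illustrative sketch only: a helper that assembles a prompt with an explicit
# antipattern ("avoid") section alongside positive instructions.
# All prompt text below is hypothetical, not from a real project.

def build_prompt(task: str, patterns: list[str], antipatterns: list[str]) -> str:
    """Assemble a prompt that pairs positive patterns with explicit antipatterns."""
    do_block = "\n".join(f"- {p}" for p in patterns)
    dont_block = "\n".join(f"- {a}" for a in antipatterns)
    return (
        f"Task: {task}\n\n"
        f"Follow these patterns:\n{do_block}\n\n"
        f"Avoid these antipatterns:\n{dont_block}\n"
    )

prompt = build_prompt(
    task="Summarize the attached report for executives.",
    patterns=["Lead with the key decision points.", "Keep it under 200 words."],
    antipatterns=["Do not restate the report section by section.",
                  "Do not hedge every claim with qualifiers."],
)
```

A meta-prompt can then be asked to populate the `antipatterns` list by critiquing earlier failed outputs.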

Triaging behavior, developing evaluations, and defining processes

I have 11 years of troubleshooting and triage experience in technical support positions.

I have 7 years of experience working collaboratively with internal users and cross-functional team members to understand the context behind their needs and the impact of various methods of implementing fixes and workarounds.

I'm detail-oriented and attentive to nuance, and I'm excited both to learn from your existing methods of scaling evaluations and to contribute my own ideas as I grow in this role. A wise mentor once shared that it's important to observe and learn how a new organization does things, **and why**, for a long period before trying to implement changes or suggesting they have things backward.

Creating prompt guides, product evaluations, competitor comparisons (esp. safety), best practices, and metrics

I have experience writing documentation and communication for broad or targeted audiences. I value quality controls and embrace participating in them.

Why am I a good fit?

Prompt engineering experience

I regularly work with Claude, ChatGPT, and M365 Copilot at work and at home. I use AI for ideation, coding, spot-checking, generating initial insights, condensing and learning information, and refining prompts for better results.

My prompts include combinations of prompt engineering techniques, nuanced requirements, and methods for arriving at solutions; all depending on the context and what I'm trying to accomplish.

Lately, I've been excited by experimenting with prompts that cycle and orchestrate multiple agents.

I find that whenever I can reliably learn and remember the name of a human thinking tool, tactic, or idea, I can often adapt it for use with AI. For example, I've adapted hermeneutic thinking by asking the AI to interpret my prompts and share its understanding before I commit to the next steps. Asking AI to reflect on the reason for a prompt often gives me great results. It also produces verbose, ponderous outputs (which is the point), but sometimes leads both of us to over-engineer or worry too much about edge cases. It can also lead the AI to over-emphasize the hermeneutic reasoning content as the context window grows longer.

It's clear that you already understand the strength of formal names for thinking tools; that's likely the reason you believe strong candidates have formal training in philosophy and psychology. "Thinking about thinking" is an area where I can expand the creative tools in my AI toolbox. Notably, while I improve in this area, I can manage by requesting a list of thinking modes that might relate to the problem at hand. I can use hermeneutics, sometimes with systems thinking, to list the modes that relate to different parts of the problem and are most likely to deliver on my reasons for creating the prompts.

Example - Godot Card Game: I've been working on a side project building a card-based video game in Godot. I'm working in VSCode with Claude via the GitHub Copilot extension. I've written multiple reusable prompts and agent files with different personas and thinking modes, and I chain them together for development tasks. I've written guiding documents for the project that the various prompts point to. I've worked effectively with a Test-Driven Development (TDD) framework, and I find that AI excels at this approach. A TDD framework is practical for automating feedback loops for an AI agent, and AI is good at writing initial tests that fail before implementation.
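For illustration, the red-green loop described above looks roughly like this in Python; the `Deck` class and its card-game rule are hypothetical stand-ins, not code from the actual Godot project:

```python
# Minimal TDD sketch (hypothetical card-game rule).
# Step 1: write the test first and watch it fail (red).

def test_draw_reduces_deck():
    deck = Deck(["ace", "king", "queen"])
    card = deck.draw()
    assert card == "ace"       # drawing returns the top card
    assert deck.remaining() == 2  # and shrinks the deck by one

# Step 2: write the minimal implementation that makes the test pass (green).
class Deck:
    def __init__(self, cards):
        self._cards = list(cards)

    def draw(self):
        return self._cards.pop(0)

    def remaining(self):
        return len(self._cards)
```

An AI agent can automate this loop well: it writes the failing test, runs it, implements, and re-runs until the suite passes.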

I'm excited to see how the new Claude "skills" feature works with my approach on this project.

Example - TRPG: I experimented with building an AI-driven tabletop RPG using open-source rules. My implementation uses Claude Desktop to generate and store files on my computer. It uses persona-based prompts to simulate roles for different needs, such as setup, archiving data, driving a session, understanding locations, and remembering NPCs. This project is another that may be a suitable implementation for the "skills" feature.

Familiarity with Claude

I admit that I'm not yet a full expert on the capabilities and best practices of any single AI tool; my interest has been broad and cross-platform. However, as a newer Claude user, and especially now with my application for this position, I'm digging deeply into Claude's capabilities, limitations, and best practices. Using custom XML tags to guide Claude's behavior seems like a powerful tool, and I tried it out in the "fix a problem with Claude" exercise.
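As a small illustration of that idea, custom XML tags let the model distinguish instructions, context, and the question. The tag names and content below are my own invented choices, not an official Claude schema:

```python
# Sketch: wrapping distinct parts of a prompt in custom XML tags so the model
# can tell instructions, source material, and the question apart.
# Tag names and content here are hypothetical illustrations.

def tag(name: str, body: str) -> str:
    """Wrap body text in a matched pair of XML tags."""
    return f"<{name}>\n{body}\n</{name}>"

prompt = "\n".join([
    tag("instructions", "Answer using only the provided document."),
    tag("document", "Quarterly revenue rose 4% over the prior quarter."),
    tag("question", "How did revenue change this quarter?"),
])
```

The same helper scales to more sections (examples, output format, etc.) without the sections bleeding into each other.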

https://www.anthropic.com/engineering seems like a great resource, and https://platform.claude.com/docs/en/build-with-claude/prompt-engineering/claude-4-best-practices is a wealth of information for new experiments.

I understand that you would prefer candidates with more experience in Anthropic tools than in OpenAI's. Rest assured that I throw myself passionately into every new undertaking. Your company has recently stood out to me, and I'm impressed with your handling of AI safety and ethics issues. I hope you'll see in these application materials an enthusiasm and aptitude for rapid growth. I also hope you'll see a candidate who is capable of comparing and contrasting Anthropic tools with competitors' tools, as your job description desires.

Python and behavioral evaluation

I regularly use Python for data analysis at both school and work, with about 5 years of experience. I'm comfortable with advanced syntax and features such as lambda functions; list, generator, and dictionary comprehensions; generators; data classes; and more. I live in the Python docs and am very comfortable learning new Python tools. On larger passion projects, I've been using AI to lean into Test-Driven Development, which seems likely to be relevant here.
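A quick, self-contained tour of those constructs, using made-up sensor data:

```python
# Demonstrates a dataclass, a generator, set/dict comprehensions, and a
# lambda used as a sort key. The sensor data is invented for illustration.
from dataclasses import dataclass

@dataclass
class Reading:
    sensor: str
    value: float

def load_readings():
    """Generator: yields Reading objects lazily instead of building a list."""
    for sensor, value in [("a", 3.2), ("b", 1.7), ("a", 4.1)]:
        yield Reading(sensor, value)

readings = list(load_readings())

# Dict comprehension (over a set comprehension of sensor names):
# the maximum value observed per sensor.
max_by_sensor = {
    s: max(r.value for r in readings if r.sensor == s)
    for s in {r.sensor for r in readings}
}

# Lambda as a sort key: sensor names ordered by their maximum value, descending.
ranked = sorted(max_by_sensor, key=lambda s: max_by_sensor[s], reverse=True)
```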

I expect to come up to speed quickly on how you write behavioral evaluations. It will be a matter of learning (possibly new) modules, syntax, and logic, all of which I should be able to ingest and adapt to.

Judgment and technical understanding

I feel that I have a strong instinct as a user about what I should expect from various inputs. My studies in Data Analytics and Data Science have provided a foundation for concepts underlying AI, such as Machine Learning. I certainly have a great deal more to learn, and am excited to do so.

One important caveat is that expertise and experience help "judgment" tremendously. I know what a simple web page should look like, and I know when it's cluttered, script-heavy, and poorly organized. I knew from the start that I didn't want this page cluttered or over-engineered with JavaScript. I understand what Python type hinting should look like, and I can spot deprecated type-hinting styles left over from versions before Python 3.9 and 3.10. I'm aware of the common AI pitfalls to watch for in my areas of expertise.

In other domains (perhaps C++, baseball coaching, biological research, or writing in a strict poem format), I would need help or detailed feedback from an expert. I would also need to take some time to immerse myself in the topic and understand what's working well with Claude, what isn't, and how to measure those issues over time.

It's been important from day one to watch for hallucination, capitulation due to sycophancy, over-engineering, and user-introduced biases. One of my favorite experiments came when my wife suggested a political question that was important to her, to see how AI would respond. It gave the answer she wanted. I quickly recognized that the phrasing of her question introduced bias, so I asked the same question with a different framing and, just as quickly, got an answer she didn't like. It was interesting trying to formulate a completely unbiased question, but it proved impossible: even the instruction to remove bias created bias. Clearly, getting unbiased answers to complex questions of human perspective is not fully promptable, nor are the resulting answers conclusive. Many topics are simply subjective, and in those cases AI is likely to reflect your values back to you. That reflection is inherent and always on, which makes it essential to recognize when the source of distortion is the prompter.

Back to concrete topics: I've been able to curb hallucinations on stubborn prompts by asking the AI to look for reliable contradictory evidence, challenge its assumptions, back up answers with specific sources, or identify why previous responses overlooked facts. (Though the latter sometimes engages "hallucinate an excuse" mode.) My prompts could benefit from more study of thinking fallacies and debate tactics; learning to spot them would sharpen my immediate judgment and give me more "nameable" tools for post-prompts and metaprompting.

Other issues I've worked on include overstated compliance ("Here is your technical documentation: clear and simple with no violence or hate speech.") and writing quality (over-dramatization of mundane points, heavy use of tropes, lack of vocal variety, not trusting the reader, etc.).

As an aside, it is fascinating to apply to work in an industry that understands how it created a technology, and still does not fully understand how that technology works, or why it works as well as it does. I am excited about having a front row seat to the research in this area.

Cross organizational collaboration, product management, and driving change

In my Service Desk role, I learned to navigate competing stakeholder priorities across organizational boundaries. For example, during a project to remove admin rights from user computers, I worked at the intersection of conflicting needs: the Information Security team required strict enforcement for compliance, R&D teams needed flexibility for allowing customized development, and executives on both sides needed a politically viable solution. My team assisted with gathering concerns from users, documenting specific use cases, evaluating security requirements, and ultimately helping enforce the compromise between IS and R&D leadership.

This taught me to translate technical constraints into business impact, build consensus by documenting trade-offs transparently, and manage expectations while driving changes that stakeholders initially resisted. For this role, those same skills apply to balancing research teams wanting novel capabilities, product teams needing consistency, and safety teams requiring guardrails, all while maintaining Claude's core behaviors across products.

I'm comfortable navigating the complexity of cross-functional collaboration and adapting my communication style for diverse stakeholders: from interns to executives, and from coders to accountants.

"Care deeply about AI safety and model welfare, understanding the ethical implications of model behaviors"

I see safety and ethics as the most critical elements of this position. AI is reshaping our world. Many companies seem to be surging forward without safety and ethics as a true priority. I value maintaining a world where AI doesn't dictate human morality, nor favor specific institutions, interests, or religions. I worry about a future where AI enables the rich and tramples the poor. Or one in which AI is a rogue, self-serving entity or a weapon of domination (physically or metaphysically). Because I have technical skills and passion for the subject matter, I also have an obligation to help prevent the futures I fear.

It seems inevitable that I will disagree with Anthropic on specific implementations of safety and ethics. I look forward to those discussions and hope that they will always be fruitful and resolvable. If agreement is not possible on a matter of profound importance, I'm sure we will find a way to part ways amicably. However, I doubt that will become necessary, given my understanding of your mission and values.

I try to understand the people around me. I'm careful not to assign values simply because they are honored by traditions or groups, and I try to listen to perspectives that make me uncomfortable. I know that I carry my own biases. I am aware enough to know that I can't reliably operate outside of them without help from others.

On that note, I'm excited about the work Anthropic is doing with Constitutional AI and eager to be part of that iterative process. For transparency: I only learned about Constitutional AI through your job posting. I did some quick research to understand the iterative cycle of human feedback, AI feedback, and model adjustment, and I look forward to learning more about your approach and contributing to it.

Why am I a strong candidate?

"Background in data science with emphasis on data quality and verification"

I have a bachelor's degree in Data Analytics. I am pursuing a master's degree in Data Analytics with a Data Science specialization. I'm about halfway through the program.

Formal training or experience in philosophy, ethics, alignment, writing specifications, etc.

I have a strong interest in each of these areas and naturally engage with them informally. I'm excited to deepen my exposure and progress in these areas more formally. I'm strong in technical writing (and writing in general).