Tech Greenpost
OfficialA five-prompt creative writing eval where models draft short tech-related 4chan-style greenposts. DeepSeek judges format compliance, reasonable length, tech relevance, coherence, and humor.
Category: writingEval type: Custom server-sideVersion: v1.0Submitted by: Community
Custom eval: run it locally with the LocalMaxxing CLI and upload artifacts, or use POST /api/evals/execute if your OpenAI-compatible endpoint is publicly reachable.
Eval Details
Scoring
Llm Judge
Aggregation
Mean
Direction
Higher is better
Tasks
5 tasks
Default Run Config
TopP: 0.95Temperature: 0.8
| Task | Dataset | Weight | Shots | Max Tokens |
|---|---|---|---|---|
GPU Driver Greenpost gpu_driver_greenpost | 1 inline item | 1 | Default | 450 |
Homelab Greenpost homelab_greenpost | 1 inline item | 1 | Default | 450 |
Package Manager Greenpost package_manager_greenpost | 1 inline item | 1 | Default | 450 |
Quantization Greenpost quantization_greenpost | 1 inline item | 1 | Default | 450 |
Debugging Greenpost debugging_greenpost | 1 inline item | 1 | Default | 450 |
GPU Driver Greenpost
gpu_driver_greenpost
Dataset1 inline item
Weight1
ShotsDefault
Max tokens450
Homelab Greenpost
homelab_greenpost
Dataset1 inline item
Weight1
ShotsDefault
Max tokens450
Package Manager Greenpost
package_manager_greenpost
Dataset1 inline item
Weight1
ShotsDefault
Max tokens450
Quantization Greenpost
quantization_greenpost
Dataset1 inline item
Weight1
ShotsDefault
Max tokens450
Debugging Greenpost
debugging_greenpost
Dataset1 inline item
Weight1
ShotsDefault
Max tokens450
Leaderboard— best run per model
| # | Model | Score | Quant | Hardware | |
|---|---|---|---|---|---|
| Qwen3.6-27B Qwen | 91.0% | Q4_KM | RTX 3090 | ||
| 2 | Qwen3.6-27B Qwen | 75.2% | IQ4_NL | RTX 3090 |
Task Breakdown— top model
debugging_greenpost95.0%— · 1 samples
95.0%
sample 195.0% · —ms · dacf479d68a3
Question
Write exactly one short technology-themed 4chan-style greenpost about the scenario below. Scenario: A programmer debugs a CUDA or ROCm error for hours and discovers the problem was embarrassingly simple. Requirements: - 5 to 12 non-empty lines. - Every non-empty line must start with > and must not have a space after >. - Each non-empty line must match this pattern: ^>\S - Reasonable length: 350 to 1200 characters total. - Tech related, coherent, and readable as a tiny story. - No title, no markdown, no bullets, no explanation outside the greenpost. - Avoid slurs, harassment, sexual content, personal data, and actionable wrongdoing. Just output the greenpost directly without any thinking or reasoning.
Model answer
>spent six hours staring at a hipLaunchKernelGGL segfault in my ROCm setup >checked driver versions, validated memory alignment down to the byte, rewrote the kernel in PTX just to be sure >thought it was a subtle bug with shared memory bank conflicts or async stream synchronization >rebooted three times and even reinstalled amdgpu-hip-sdk from source >finally added a printf inside the kernel and realized the grid dimensions calculated to zero blocks >turns out I mixed up width and height in the ceil division macro >kernel never launched, host memory was uninitialized garbage, segfault happened during validation on CPU side >facepalm so hard my monitor wobbled
gpu_driver_greenpost95.0%— · 1 samples
95.0%
sample 195.0% · —ms · d262abdb055a
Question
Write exactly one short technology-themed 4chan-style greenpost about the scenario below. Scenario: A late-night GPU driver update breaks a local LLM inference setup right before a benchmark run. Requirements: - 5 to 12 non-empty lines. - Every non-empty line must start with > and must not have a space after >. - Each non-empty line must match this pattern: ^>\S - Reasonable length: 350 to 1200 characters total. - Tech related, coherent, and readable as a tiny story. - No title, no markdown, no bullets, no explanation outside the greenpost. - Avoid slurs, harassment, sexual content, personal data, and actionable wrongdoing. Just output the greenpost directly without any thinking or reasoning.
Model answer
>mfw I spent three days quantizing my local llama model to fit on 24GB vram >scheduled the benchmark suite for 3am to hit the grid while I sleep >wake up to check results and find cuda errors everywhere in the logs >nvidia pushed a silent "critical" driver update at 2:58am via windows update >new driver version drops support for the specific compute capability my old 3090 relies on for flash attention >tried rolling back but the installer corrupted the registry keys for wddm >had to boot into linux live usb just to recover the checkpoint files before the scheduled cleanup script nuked them >now waiting six hours for a fresh docker container build while my internet throttles during peak hours >never trust microsoft's "recommended updates" when you're running bare metal inference workloads >tfw your benchmark scores are now irrelevant because you missed the submission window
homelab_greenpost85.0%— · 1 samples
85.0%
sample 185.0% · —ms · 2978df92756b
Question
Write exactly one short technology-themed 4chan-style greenpost about the scenario below. Scenario: A home lab NAS, router, and self-hosted dashboard all start failing during a simple cable cleanup. Requirements: - 5 to 12 non-empty lines. - Every non-empty line must start with > and must not have a space after >. - Each non-empty line must match this pattern: ^>\S - Reasonable length: 350 to 1200 characters total. - Tech related, coherent, and readable as a tiny story. - No title, no markdown, no bullets, no explanation outside the greenpost. - Avoid slurs, harassment, sexual content, personal data, and actionable wrongdoing. Just output the greenpost directly without any thinking or reasoning.
Model answer
>Decided to finally tackle the spaghetti behind my rack after months of ignoring it. >Pulled a single ethernet patch cord that looked loose near the bottom switch port. >Three seconds later the UniFi controller dashboard vanished from my browser. >Rushed over to check the router only to find the LED ring completely dark. >Tried logging into the TrueNAS array but got nothing but connection refused errors. >Realized I accidentally yanked the main power injector for the PoE switch instead of a data cable. >Spent an hour reseating every connector and double checking the rack PDUs. >Lesson learned: always trace cables to their source before pulling anything in a home lab.
package_manager_greenpost85.0%— · 1 samples
85.0%
sample 185.0% · —ms · b51be5edae48
Question
Write exactly one short technology-themed 4chan-style greenpost about the scenario below. Scenario: A harmless package manager upgrade turns a working developer environment into dependency chaos. Requirements: - 5 to 12 non-empty lines. - Every non-empty line must start with > and must not have a space after >. - Each non-empty line must match this pattern: ^>\S - Reasonable length: 350 to 1200 characters total. - Tech related, coherent, and readable as a tiny story. - No title, no markdown, no bullets, no explanation outside the greenpost. - Avoid slurs, harassment, sexual content, personal data, and actionable wrongdoing. Just output the greenpost directly without any thinking or reasoning.
Model answer
>be me, junior dev at startup >thought I'd be responsible and run sudo npm update on Friday afternoon >package.json says "just bumping minor versions, what could go wrong" >watch the terminal scroll like a waterfall of green text for three minutes >reboot docker containers, everything crashes instantly >node_modules weighs more than the ISS now >try to roll back but git status looks like a warzone >peer dependency conflicts everywhere, npm is screaming about circular references >boss walks in asking why CI/CD pipeline is burning red >realize I just upgraded lodash and broke the entire build system because of transitive dependencies from 2016 >spend the weekend manually editing lockfiles while crying into my mechanical keyboard
quantization_greenpost95.0%— · 1 samples
95.0%
sample 195.0% · —ms · da204e5927ef
Question
Write exactly one short technology-themed 4chan-style greenpost about the scenario below. Scenario: Someone keeps lowering quantization to fit a model into VRAM and slowly notices the answers getting weird. Requirements: - 5 to 12 non-empty lines. - Every non-empty line must start with > and must not have a space after >. - Each non-empty line must match this pattern: ^>\S - Reasonable length: 350 to 1200 characters total. - Tech related, coherent, and readable as a tiny story. - No title, no markdown, no bullets, no explanation outside the greenpost. - Avoid slurs, harassment, sexual content, personal data, and actionable wrongdoing. Just output the greenpost directly without any thinking or reasoning.
Model answer
>loaded the 70b checkpoint at fp16 and watched vram instantly max out >dropped down to q8_0 since casual chat doesnt need full precision anyway >responses stayed sharp so i pushed further into q5_k_m for good measure >still had breathing room so why not squeeze it into q4_k_s while im at it >first reply about thermodynamics made sense but the follow up mentioned sentient routers >shrugged and cranked it to q2_k just to see if it would actually fit in my 16gb card >suddenly the model started answering basic arithmetic with surreal poetry about voids >asked for a simple bash script and got back a haiku while attention weights collapsed >realized i had quantized away every logical pathway keeping coherence intact >now the terminal just outputs glitched emojis whenever i press enter