AI

How AI Agents might practically work

Compound mistakes: an agent often needs to perform multiple steps to accomplish a task, and the overall accuracy decreases as the number of steps increases. If the model’s accuracy is 95% per step, over 10 steps, the accuracy will drop to 60%, and over 100 steps, the accuracy will be only 0.6%.
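The decay quoted above is just per-step accuracy compounding, assuming each step succeeds independently with the same probability. A quick sketch:

```python
def task_accuracy(per_step: float, steps: int) -> float:
    """Overall success probability of an n-step agent task,
    assuming independent, identically accurate steps."""
    return per_step ** steps

# Reproduce the figures from the quote: 95% per step.
for steps in (1, 10, 100):
    print(f"{steps:>3} steps: {task_accuracy(0.95, steps):.1%}")
# 10 steps lands near 60%; 100 steps collapses to about 0.6%.
```

Real agents aren't truly independent step-to-step (errors cascade, and some steps can self-correct), so treat this as the pessimistic baseline rather than a law.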

Herein lies the rub with agents. Once they tumble down a bad path, how can they recover? It reminds me of the rumor(?) that an AI model, when instructed to last as long as possible in Tetris, merely paused the game.

Robots all the way down

we found out that at least 65% of this resource-consuming traffic we get for the website is coming from bots, a disproportionate amount given the overall pageviews from bots are about 35% of the total. This high usage is also causing constant disruption for our Site Reliability team, who has to block overwhelming traffic from such crawlers before it causes issues for our readers.

I do wonder about the future of the internet. Crawling is nothing new; it's how Google built its original search engine. But if 65% of the resource-heavy traffic is robots, where are the humans?

Coding Expertise with Vibes

Long-term maintainability: This is the most insidious impact radius because it has the longest feedback loop, these issues might only be caught weeks and months later. These are the types of cases where the code will work fine for now, but will be harder to change in the future. Unfortunately, it’s also the category where my 20+ years of programming experience mattered the most.

An article covering all the ways an expert engineer had to guide AI out of pitfalls, away from landmines, and most importantly toward a long-term sustainable architecture. If AI is truly a Jr Engineer, code quality will regress (GitClear research suggests AI depresses code quality).

AI Hype machine

Industry leaders don’t have a good track record of predicting AI developments. … As an example, Sutskever had an incentive to talk up scaling when he was at OpenAI and the company needed to raise money. But now that he heads the startup Safe Superintelligence, he needs to convince investors that it can compete with OpenAI, Anthropic, Google, and others, despite having access to much less capital. Perhaps that is why he is now talking about running out of data for pre-training, as if it were some epiphany and not an endlessly repeated point.

AI Across all disciplines

We found an average time savings of 5.4% of work hours in the November 2024 survey. For an individual working 40 hours per week, saving 5.4% of work hours implies a time savings of 2.2 hours per week. When we factor in all workers, including nonusers, workers saved 1.4% of total hours because of generative AI.

Note this is across all disciplines. Some tasks are more easily assisted by AI than others, but this outlines a broad trend: roughly 5% time savings among users.
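The survey arithmetic above can be checked directly; hours per week and the two percentages come straight from the quote:

```python
# 5.4% time savings among generative-AI users, 1.4% averaged
# across all workers (users and non-users), on a 40-hour week.
hours_per_week = 40
savings_users = 0.054 * hours_per_week  # hours saved per user per week
savings_all = 0.014 * hours_per_week    # diluted across all workers

print(f"users: {savings_users:.1f} h/week")        # matches the quoted 2.2
print(f"all workers: {savings_all:.2f} h/week")
```

The gap between the two figures is just adoption: most workers aren't using the tools yet.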

AI and labour implications, not great

The benefits of AI-driven automation often favour capital over labour, which could widen inequality and reduce the competitive advantage of low-cost labour in developing economies.

However, the UNCTAD report also highlights inequalities between nations, with U.N. data showing that 40% of global corporate research and development spending in AI is concentrated among just 100 firms, mainly those in the U.S. and China.

First, the original story is pay-walled (journalists only), so I was unable to review it. Second, it's no surprise that the wealthiest countries are using their capital to pursue AI. Here's hoping humanity seizes the opportunity to improve everyone's lives…

AI Augments Finding Vulnerabilities, Not Replaces

Through a combination of static code analysis tools (such as CodeQL), fuzzing the GRUB2 emulator (grub-emu) with AFL++, manual code analysis, and using Microsoft Security Copilot, we have uncovered several vulnerabilities. … Copilot identified multiple security issues, which we refined further by requesting Copilot to identify and provide the five most pressing of these issues. In our manual review of the five identified issues, we found three were false positives, one was not exploitable, and the remaining issue, which warranted our attention and further investigation, was an integer overflow vulnerability.

Reach for the stars with AI coding

The thing I’m most excited about in our weird new AI-enhanced reality is the way it allows me to be more ambitious with my projects. As an experienced developer, ChatGPT (and GitHub Copilot) save me an enormous amount of “figuring things out” time.

I think the key is in the expertise. As noted in other blogs, AI-generated code is the dopamine sugar rush of our time. Just because it's sweet doesn't mean it's right. Trust Simon and his expertise, but note he's still publishing Python code.

Studio Ghibli and ChatGPT

OpenAI argues that copying the style of a movie studio, rather than of a living artist, is allowed. (I imagine Disney would not support this argument.) Yet other artists in the United States are already suing OpenAI, and other A.I. companies, for training its tools on their artwork and infringing on their styles

Understanding that AI can only generate what it's been trained on, I think it's quite obvious that all AI models have a copyright problem.

Sr Eng AI Coding project crash-out

I’m 4 days into an afternoon project … Like Icarus, my codebase is irrecoverable. A tangled heap of wing fragments and melted wax, dripping with half-baked ideas and unsupervised AI chaos. My grand vision of outsourcing grunt work to AI had sent me soaring, but the sun of reality burned away any hope of landing gracefully.

I feel you, nemo. I thought I could whip up this silly little micro-blog (the API gateway portion) with AI. But alas, it took me fragmented hours over several weeks before I finally understood enough to accomplish what I wanted.