The greatest theft in history? -- How big tech is benefiting from our data and why we should care [Q&A]


If you've written a book or composed some music or engaged in any other form of creative endeavour, the chances are that AI will have used it as part of its learning process.
This appropriation of copyright material is controversial since the original creator doesn't benefit.
We spoke to Jamie Dobson, founder of Container Solutions and author of The Cloud Native Attitude, about what he's calling 'the greatest theft in history'.
BN: How are tech companies using data to train AI?
JD: Most of the AI systems in the news or flying through your social media feeds are built around artificial neural networks (ANNs). Before an ANN is useful, it is trained on data, such as text or images. That training process leaves the 'neurons' in the network 'weighted' and, together, those weights process incoming data much as your brain processes incoming data from the world, helping you to spot obstacles as you cross the street.
Once the ANN is trained, just as you were trained on pictures of cats by your mum or dad, the network is ready and the data is no longer needed. The evidence, that's to say the data, can be thrown away, and it's almost impossible to work out from the ANN which data it was trained on. Just as it is impossible for me, 48 years after it happened, to tell you which pictures of cats I was trained on as a child.
That's how companies use data to train ANNs. If that data is bought, is proprietary or is used with permission, fine. If it is stolen, then that is, of course, the opposite of fine.
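The train-then-discard process described here can be sketched in miniature. The toy 'network' below is a single neuron, and the training data is a made-up set of labelled points standing in for books or images (everything in this sketch is illustrative, not drawn from the interview). Once training finishes, the data is deleted: only the learned weights remain, and nothing in them identifies the individual examples they came from.

```python
import random

random.seed(0)

# Made-up training data: points in the unit square, labelled 1.0 when
# x + y > 1.0. These stand in for the books, images or music a real
# ANN ingests. We keep a gap around the boundary so training converges.
data = []
while len(data) < 200:
    x, y = random.random(), random.random()
    if abs(x + y - 1.0) < 0.2:
        continue
    data.append(((x, y), 1.0 if x + y > 1.0 else 0.0))

# A single 'neuron': two weights and a bias, trained with the classic
# perceptron update rule.
w1, w2, b = 0.0, 0.0, 0.0
lr = 0.1
for _ in range(50):
    for (x, y), label in data:
        pred = 1.0 if w1 * x + w2 * y + b > 0 else 0.0
        err = label - pred
        w1 += lr * err * x
        w2 += lr * err * y
        b += lr * err

# Once trained, the data is no longer needed: throw away the 'evidence'.
# Only the weights (w1, w2, b) survive, and you cannot read the original
# examples back out of them.
del data

def predict(x, y):
    """Classify a new point using only the learned weights."""
    return 1.0 if w1 * x + w2 * y + b > 0 else 0.0
```

After `del data`, the network still classifies new points correctly, which is JD's point: the trained model works, but the material it learned from has vanished like falsework from a finished arch.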
BN: Why is this such a problem?
JD: Theft is a crime. That's why it's a problem. When it comes to training ANNs, however, it's a crime that is hard to detect. Take a look at the arches under a train line or station. Arches are built over a temporary wooden structure called falsework. When the arch is finished, the falsework vanishes. It is lost to history just as footprints on the beach are lost to the tides. The training data for ANNs is falsework. The ANN is the arch.
If those creating the falsework (the books, the movies, the poems, the artwork that ANNs are trained on) are not paid at all, production will cease. ANNs will stagnate and we will all suffer. And even if those creatives are paid, but not paid fairly, the gains of their work will continue to be appropriated by massive companies which have all the engineers and all the models. Does anybody think that such companies, without any government oversight, should have such power, or that this will end well for the public?
BN: What do you think of the UK government's plans to allow tech companies to legally use copyrighted content for AI training?
JD: It's flawed. It makes it too hard for creatives to opt out. The government is weighing up the impact AI might have on society and considering sacrificing fairness on the altar of progress. There is precedent for this. During the First and Second World Wars, patents were breached and new technologies were created through patent theft. It made sense: given the potential destruction of the world, and of the way of life of countries like the UK and the USA, who would care about a few patents? After the war ended, the patent courts sorted it all out.
Does this moment in history warrant such a great theft? No. It can be done differently. Creatives can get paid, and we need to pay them so they keep creating. Governments can create the legal frameworks. And more than anything, they must make sure the gains from all this data, and from the ANNs it is used to train, benefit us all and not just a handful of American tech giants.
BN: Why should Joe Average care about what's happening?
JD: Because Joe Average has children who are creative, because Joe Average pays for the education of creatives through his taxes, and because he pays for the public infrastructure that underpins the internet. We should no more turn our data over to foreign companies than we should give them our oil or our top scientists. People are not stupid, and soon the loss of such assets will be felt in their cost of living.
BN: Clearly AI is here to stay, so what can we do to protect individual and corporate data rights?
JD: The solutions to this challenge need to be as innovative as the technology causing it. Here are several approaches we should consider:
- Data Rights and Compensation: Establish a framework where data creators receive compensation for their contributions to AI training. This could work much as it does for musicians, who receive royalties when their songs are played.
- Algorithmic Transparency: Require AI companies to maintain and disclose training data sources, making it possible for creators to track and verify the use of their work.
- Public AI Infrastructure: Develop public alternatives to private AI models, ensuring that the benefits of this technology aren't concentrated in corporate hands.
- Progressive AI Taxation: Implement a scaled taxation system for AI companies based on their data usage and market impact, funding public services and potentially a Universal Basic Income.
- Digital Commons Framework: Create a new category of digital rights that balances innovation with fair compensation, perhaps through a system of micropayments or credit attribution.
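The 'Algorithmic Transparency' idea could, at its very simplest, take the form of a disclosed manifest of content hashes for training sources. The sketch below is a hypothetical illustration only (the function names and fields are invented, and real provenance systems would be far more involved): a creator who holds an exact copy of a work can check whether its hash appears in a model's published manifest.

```python
import hashlib

def record_source(manifest, name, content, licence):
    """Add one training source to a (hypothetical) provenance manifest,
    keyed by the SHA-256 hash of its content."""
    digest = hashlib.sha256(content).hexdigest()
    manifest[digest] = {"name": name, "licence": licence}
    return digest

def creator_can_verify(manifest, content):
    """A creator with an exact copy of their work can check whether it
    appears among the disclosed training sources. Note the limitation:
    hashing only matches byte-identical copies, not edited versions."""
    return hashlib.sha256(content).hexdigest() in manifest

# Illustrative use: an AI company publishes its manifest...
manifest = {}
record_source(manifest, "my-novel.txt", b"Chapter 1 ...", "unlicensed")

# ...and an author checks whether their book was used.
used = creator_can_verify(manifest, b"Chapter 1 ...")
```

Even a scheme this crude would shift the burden of proof: instead of the training data vanishing like falsework, there would be a public record for creators, courts and regulators to inspect.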
Image credit: Fernando Gregory/Dreamstime.com