Scientists believe they might have found a way to overcome “model collapse”, a phenomenon that threatens the training of AI as we know it.

To improve, artificial intelligence systems such as those used in ChatGPT must be trained on more and more real data. But much of that data is taken from writing on the internet, which is itself often produced using such models. The supply of real data is shrinking rapidly, and is predicted to run out as early as this year.

Data produced by other AI systems, meanwhile, could quickly lead to “model collapse”. The term describes AI systems engaging in “data cannibalism”: when they are trained on their own outputs, they rapidly become less useful and more prone to dangerous falsehoods.

But researchers have suggested that using just one datapoint from the outside world can prevent the problem. They showed this using a class of statistical models called “exponential families”.…
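
For readers who want to see the effect for themselves, the feedback loop described above can be imitated in a toy simulation. This is only an illustrative sketch, not the researchers' method: it assumes the "model" is a single bell curve (a Gaussian, one of the simplest exponential families), that the real world is a bell curve with mean 0 and spread 1, and that each generation is refitted by maximum likelihood to 20 of its predecessor's own outputs. The function name `simulate` and its parameters are invented for this example.

```python
import random
import statistics


def simulate(generations, n_synthetic, n_real, seed):
    """Repeatedly refit a Gaussian to samples drawn from the previous fit.

    n_real fresh points from the assumed real-world distribution N(0, 1)
    are mixed into every generation's training set; n_real=0 is the fully
    closed loop. Returns the fitted variance after the last generation.
    """
    rng = random.Random(seed)
    true_mean, true_std = 0.0, 1.0   # assumed "outside world"
    mean, std = true_mean, true_std  # generation 0 matches the real world
    for _ in range(generations):
        # Training data: the previous model's own outputs...
        data = [rng.gauss(mean, std) for _ in range(n_synthetic)]
        # ...plus, optionally, a few genuine observations.
        data += [rng.gauss(true_mean, true_std) for _ in range(n_real)]
        mean = statistics.fmean(data)
        std = statistics.pstdev(data)  # maximum-likelihood spread estimate
    return std ** 2


closed_loop = simulate(generations=2000, n_synthetic=20, n_real=0, seed=0)
anchored = simulate(generations=2000, n_synthetic=20, n_real=1, seed=0)
print(f"variance, trained only on own outputs: {closed_loop:.3g}")
print(f"variance, one real datapoint per generation: {anchored:.3g}")
```

In this toy setting, the model trained only on its own outputs sees its variance shrink towards zero, meaning it ends up producing essentially the same value every time, while the version given a single real observation per generation keeps a spread of roughly the right order. Whether and how this carries over to large language models is a separate question that the sketch does not address.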