For over three years, we’ve been refining this technology—and this year, something remarkable happened. We had the opportunity to test it on one of the most controversial private collections in recent memory: the Nina Moleva collection.
This collection was rumored to hold Renaissance treasures and works by famous 19th- and 20th-century masters – so of course, we had to investigate. I examined nearly 100 pieces—icons from the 16th to 19th centuries, paintings, drawings—and for the first time, I integrated ArtCollecting’s AI not just as a tool, but as a collaborator, an assistant. And what it revealed… surprised even me.
Because when artificial intelligence begins to see deeper than the human eye, it doesn’t just change how we authenticate art—it changes how we understand history itself.
Let me tell you how it all began. I was staring at a painting—one of those disputed works, the kind that could either be a forgotten masterpiece or a brilliant fake—and I thought: Why isn’t there an AI that can solve this? Not just a reverse image search like Google Lens, or a general-purpose tool like Yandex’s “Alice,” but something built specifically for the secrets art holds. Something that doesn’t just see, but understands.
So I started digging. I interviewed conservators, tech developers, even forgers—because to build something that catches deception, you have to know how deception works. And here’s what I realized: authentication isn’t just about style or provenance. It’s about physics. Chemistry. The way a 400-year-old crack forms in varnish, the way pigments settle over centuries. The human eye is incredible, but it has limits. So I asked: What if we trained AI to see beyond them? Or to work with archives faster and better?
That’s how ArtCollecting’s algorithms were born. We built them to analyze not just brushstrokes, but the microscopic fingerprints of time—paint layers, corrosion patterns, even hidden sketches beneath the surface. And when Nina Moleva’s $2 billion collection surfaced, with its wild claims of Leonardos and Michelangelos, we knew this was the test.
I’ll never forget the first time we ran the scans. Moleva’s collection was unlike anything I’d encountered: Belyutin’s avant-garde works alongside 16th-century icons, a Monet next to a Goncharova. The algorithm flagged things we would have missed—ancient inscriptions, for example. And that’s when it hit me: This isn’t just about verifying art. It’s about rewriting history.
Because here’s the truth—the next great art scandal won’t be solved by a connoisseur’s eye. It’ll be cracked open by code that reads a painting like a forensic scientist reads a crime scene. And if we get this right? We won’t just be authenticating masterpieces. We’ll be protecting the very idea of truth in art.
Questions – Answers
Yes, I believe that in the future, artificial intelligence will be used for art authentication. It will serve as a valuable assistant to art experts and reduce routine tasks. However, to accelerate progress toward this future, efforts must be made to digitize museum collections—this will significantly expand the pool of publicly available training data for AI. While major museums like the Hermitage or the Louvre have published photographs of many artworks from their collections online, smaller museums lack even amateur photos. For example, you won’t find online images of works by Vrubel or Vasnetsov from Abramtsevo, even though Abramtsevo was a hub for many artists. Similarly, you won’t find photos of fresco fragments or icons from Sergiev Posad, even though the masterpieces of 17th-century artists depicting biblical scenes are no less impressive than Renaissance works, even if created a century later. The same issue exists in Italy—there are too few publicly available photos of Ravenna’s mosaics. The list could go on endlessly. Preparing datasets requires more photographic material. We trained our AI on data we collected over years. For instance, I worked as an art critic for a decade, attending two to three exhibitions daily, photographing everything, and archiving the images. This extensive database proved invaluable in our work.
Yes, artificial intelligence can detect obvious forgeries. A recent case from our practice involved a sculpture attributed to the renowned artist Pavel Trubetskoy. The sculpture bore two marks—one indicating the casting date, 1899, and another from the French foundry Claude Valsuani. However, it is known that in 1899, Trubetskoy lived and worked in Moscow, casting sculptures in his own workshop. He only moved to Paris in 1906 and, even then, did not use Claude Valsuani for casting. The Claude Valsuani mark on the fake also differed from those on authenticated museum pieces, including the style of the letter R (with and without serifs). Artificial intelligence conducts in-depth research and cross-references facts—yes, it has been trained to recognize marks and signatures.
No. AI can be trained to perform deep searches and analyze data from open sources and internal databases. Let’s return to the Trubetskoy example. What does an art expert do today if they don’t remember specific details or dates? They search Google for examples of foundry marks used on Trubetskoy’s sculptures, open each website, then look up the years he worked in Moscow and later in Paris to cross-check the facts. This process takes a lot of time. AI reduces the time art experts spend on research—you upload an image, and in seconds, the AI performs the tasks it was trained to do. In practice, this helps with pre-sale work. An AI application integrated with a CRM can filter inquiries for potential fakes. Art experts or managers can then prioritize, sorting through submissions and focusing on the most critical ones. The market for expert services is very niche, and a single specialist may receive anywhere from 50 to 200 inquiries per day.
Artificial intelligence won’t replace art experts, but it can significantly reduce the time spent working with archival materials, comparing them, and conducting analysis. I’d like to focus specifically on archives. I no longer need to visit each library or museum website to compile a list of relevant literature—AI generates this list in seconds and even provides direct links to download digital versions of the books. What used to take five hours of research now takes mere seconds.
Artificial intelligence will remain an auxiliary tool. To determine the authenticity of artworks, one must examine them in person – especially in cases where it’s not obvious from photographs, even high-quality ones. There are instances when an art expert can immediately identify a forgery without lengthy research.
As an expert in cultural property myself who regularly interacts with other specialists, I’ve observed completely varied reactions – as is typical with any innovation, it all depends on the individual and their background. The most fascinating discussions I’ve had were with PhDs – university professors. They actively want graduate students to pursue dissertation topics focused on artificial intelligence. Art authentication through AI is one such promising area. Humanities departments are particularly anticipating research that develops methodologies for this type of authentication.
The algorithm was specifically trained to recognize fonts and handwriting styles. To achieve this, we compiled datasets containing font libraries (including ancient ones) and collections of artists’ handwriting samples. The AI analyzes each letter individually, identifying similarities and differences. It examines various characteristics: letter shapes, presence of serifs, spatial orientation of strokes, handwriting fluency, rhythm and coordination of movements, letter connections, and writing pressure – all detectable through computer vision. The system can also identify eclectic writing styles typical of provincial iconography, such as cases where inscriptions on icons combine letterforms from different historical periods.
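To make the letterform comparison above concrete, here is a minimal sketch of the idea: reducing each binarized letter image to a small feature vector (ink density, proportions, row/column stroke profiles that hint at serifs and stroke orientation) and comparing the vectors. The feature choices and function names here are illustrative assumptions, not the production system’s actual pipeline.

```python
import numpy as np

def letter_features(glyph: np.ndarray) -> np.ndarray:
    """Extract a tiny feature vector from a binary glyph image (1 = ink)."""
    ys, xs = np.nonzero(glyph)
    h = ys.max() - ys.min() + 1
    w = xs.max() - xs.min() + 1
    density = glyph.sum() / glyph.size        # overall ink coverage
    aspect = w / h                            # letter proportions
    # Row/column ink profiles hint at stroke orientation: serifs add
    # short horizontal bars at the ends of vertical strokes, changing
    # the row profile's shape.
    horiz = glyph.sum(axis=1).std() / (glyph.sum() + 1e-9)
    vert = glyph.sum(axis=0).std() / (glyph.sum() + 1e-9)
    return np.array([density, aspect, horiz, vert])

def similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two letters' feature vectors."""
    fa, fb = letter_features(a), letter_features(b)
    return float(fa @ fb / (np.linalg.norm(fa) * np.linalg.norm(fb)))
```

A real system would add many more features (pressure, fluency, connections), but the pattern is the same: letters become vectors, and vectors are compared.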
To train the AI, we utilized image and data parsing from open sources while specifically teaching the AI search algorithms to uncover sources not visible through Google or Yandex. We also incorporated our own materials – self-compiled libraries of artist signatures and image collections gathered through firsthand museum expeditions.
The AI examines each letter individually, analyzing similarities and differences through multiple parameters: letter shapes, presence of serifs, spatial orientation of strokes, handwriting fluency characteristics, rhythm and coordination of movements, distinctive letter connections, and writing pressure – all detectable through computer vision technology.
Yes, there have been surprises. One notable case involved a handwriting analysis – both with AI assistance and subsequent traditional examination – that confirmed Édouard Manet’s signature on the reverse of a graphic artwork. The analysis also verified the work’s creation period as the 19th century. However, comparative and stylistic analysis of the artwork itself contradicts Manet’s authorship. A final decision regarding the work’s authenticity has not yet been reached, as research is still ongoing.
In the case of Édouard Manet’s graphic artwork, both artificial intelligence and a subsequent conventional handwriting examination independently confirmed the authenticity of the signature.
Over 500 samples of Édouard Manet’s handwriting were identified in publicly available low-resolution photographs. Additionally, our team conducted original museum photography to capture 10 additional reference specimens.
The AI primarily focuses on analyzing period-specific alphabet characteristics (as alphabets naturally evolve over time) and cross-referencing letterforms with authenticated reference samples.
AI can accurately compare signatures, marks, and fonts. It can also date objects by analyzing letter shapes. We plan to train it in calligraphy too. But AI can’t replace art experts. Only humans can do proper style comparisons and full technical analysis. AI can identify pigments from microscope photos. But results may be unclear. For example, it might say a pigment was used in both the 1500s and 1900s – which could be true. Art experts review all results. They use AI data about pigments and signatures. But they make the final judgment.
Yes, always. This is necessary at the machine learning stage.
I lead the development of specialized training datasets for our AI-powered art authentication system. My work focuses on compiling and curating comprehensive visual references that enable machine learning models to analyze multiple authentication factors. For period verification, we assemble thousands of dated artwork images showing stylistic evolution across movements and schools. For authorship attribution, we create detailed signature banks with hundreds of authenticated examples per artist, supplemented by technical image libraries documenting characteristic brushwork patterns, pigment combinations, and material applications specific to each creator. The datasets incorporate microscopic analyses of paint layers, multispectral imaging data, and provenance documentation where available. We continuously expand these resources through museum partnerships, archival research, and controlled photography sessions of verified masterpieces.
Our AI art authentication application utilizes specially compiled reference libraries containing artists’ signatures, pattern databases, and categorized artwork photographs organized by both historical period and the specific genres each artist worked in. These comprehensive collections enable systematic analysis of authentic characteristics across an artist’s body of work. Also we developed dedicated pigment, binder, and material reference libraries by compiling and analyzing previously published scientific research on historical art materials. These specialized datasets draw upon established art technological studies and conservation science literature to create comprehensive material profiles for authentication purposes.
Our AI art authentication system follows a rigorous data quality and machine learning process with a clear development roadmap. We initially focused on artists like Marc Chagall who have tens of thousands of publicly available verified materials. The system automatically collects and processes all accessible documentation about each artist from trusted sources, supplemented by our original photographic documentation.
Through continuous testing, we refine our algorithms to optimize artwork recognition accuracy. The AI pre-selects and categorizes sources (e.g., “Museum” or “Auction”), automatically filtering out unreliable ones like reproduction shops. All sources undergo additional verification by art experts, with a priority ranking system that favors museum collections and our original photography over auction records (some of which get flagged as questionable).
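The priority-ranking step described above can be sketched in a few lines: each source category carries a trust score, zero-priority categories are filtered out, and the rest are sorted best-first for expert review. The category names and weights below are assumptions for illustration, not the production configuration.

```python
# Assumed trust weights per source category (illustrative only).
PRIORITY = {
    "museum": 3,             # highest trust: curated collections
    "own_photography": 3,    # our original reference photographs
    "auction": 1,            # usable, but some lots get flagged
    "reproduction_shop": 0,  # unreliable: filtered out entirely
}

def rank_sources(sources):
    """Drop zero-priority sources, then sort the rest best-first."""
    kept = [s for s in sources if PRIORITY.get(s["category"], 0) > 0]
    return sorted(kept, key=lambda s: PRIORITY[s["category"]], reverse=True)
```

Expert verification then starts from the top of the ranked list, so museum records are always reviewed before auction records.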
We’ve also developed specialized reference libraries of known forgeries and created detailed authentication checklists that our algorithms must complete during analysis. This multi-layered approach combines AI efficiency with expert validation to ensure reliable results.
We develop customized methodologies for building artist-specific datasets, as each case presents unique challenges related to both the artistic legacy itself and available photographic documentation. Take Natalia Goncharova as an example – while her childhood and youthful sketches survive in museum collections, none have been definitively authenticated.
When examining a newly discovered early sketch attributed to Goncharova, authenticators face a dilemma. Without verified reference works, they must compare it to other similarly questionable pieces rather than confirmed exemplars. The analysis becomes even more complex when considering alternative attributions – if not by Goncharova, then by whom? This requires compiling comparative materials from unknown artists with matching stylistic characteristics.
These exact challenges define the specialized process of dataset preparation. Our methodology must account for such ambiguities by creating multi-layered reference systems that include both confirmed works and properly catalogued attribution uncertainties, enabling the AI to navigate these art historical complexities.
A background in art history is essential when developing the training and testing methodology for AI algorithms used in art authentication. This expertise ensures proper interpretation of artistic styles, techniques, and historical contexts that machines alone cannot fully comprehend. Art historians provide the critical framework that guides how AI systems analyze brushwork, materials, and other authentication factors while accounting for art historical nuances. Their knowledge shapes both what the AI learns and how its conclusions should be evaluated, bridging the gap between technical analysis and connoisseurship. Without this specialized human expertise, AI systems would lack the necessary art historical foundation to make reliable authentication judgments.
Our data collection combined two key approaches: parsing and ranking publicly available sources using a priority system we developed, and building proprietary databases from our original photographic materials. The databases store extensive artwork profiles – each piece can have over 50 distinct parameters carefully defined by art experts. Our programming team then created specialized algorithms to analyze and cross-reference these parameters for authentication purposes.
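The cross-referencing of artwork profiles can be illustrated with a toy version: each profile is a mapping of named parameters (the real databases store over 50 per piece), and a match score counts how many shared parameters agree between a candidate and an authenticated reference. The field names below are invented for the example.

```python
def match_score(candidate: dict, reference: dict) -> float:
    """Fraction of shared profile parameters with identical values."""
    shared = set(candidate) & set(reference)
    if not shared:
        return 0.0
    agree = sum(candidate[k] == reference[k] for k in shared)
    return agree / len(shared)
```

In practice the comparison is weighted and parameter-specific (a pigment mismatch matters more than a frame mismatch), but the core idea is this kind of parameter-by-parameter agreement.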
Working with Nina Moleva’s collection presented numerous fascinating aspects. Let me recount her background – Nina Moleva was an art historian who taught art history at Moscow State University. Her husband, Eliy Belyutin, was a renowned abstract artist, though his expertise extended far beyond abstraction. He possessed deep knowledge of art history, taught academic drawing, and experimented with various styles and movements. The couple collaborated with conservator Alexey Rybnikov, who helped preserve and pack their collection during challenging times in the Soviet era.
This raises intriguing questions: Could these three specialists have formed a remarkable trio of art forgers? Quite possibly, even during Soviet times. Might Belyutin have practiced imitating other artists’ works as a hobby or learning exercise in his early career? Certainly.
However, the collection’s condition presents additional challenges. Despite Rybnikov’s conservation efforts, many works show poor preservation. Visual examination reveals traces of local fauna – mouse droppings, spider webs, and cockroach remains – representing the most extensive biological contamination we’ve encountered. Some pieces may require re-evaluation after professional cleaning to remove these organic deposits.
We collected both images and metadata.
10,000+
Classification, clustering, decision trees, Markov processes (modeling event sequences and forecasting), reinforcement learning, regression, deep learning, transfer learning. Each algorithm has its strengths and weaknesses. The choice of algorithm depends on the specific task. A combination of different algorithms is often used. Efficiency depends on the quality and volume of data.
CNN (Convolutional Neural Networks), RNN (Recurrent Neural Networks), LSTM (Long Short-Term Memory). Support Vector Machines (SVM): feature classification, detection of boundaries between classes. Image Processing Methods: Hough Transform – detection of geometric primitives; Scale-Invariant Feature Transform (SIFT) – keypoint detection; FAST – corner detection in signatures; Canny – edge detection.
Distance matrix — creation of an invariant matrix that takes into account shift, rotation, and scale. Extremum method — search for correspondence between points of local maxima and minima. Hidden Markov model — analysis of a sequence of signature elements. Pyramidal representation — division of a signature into sections with construction of ellipses of inertia. Approximation by Bezier curves — construction of curves based on signature points. Elliptic primitives — description of signature sections through geometric figures. Convolutional networks (CNN) — analysis of spatial features of a signature. Recurrent networks (RNN) — processing a sequence of writing points. LSTM networks — consideration of the time component during comparison.
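The first item in the list above, the invariant distance matrix, can be sketched directly: pairwise distances between a signature’s sample points are unchanged by shift and rotation, and dividing by the mean distance removes scale, so two copies of one signature produce the same matrix. This is a toy version of the technique under those stated assumptions, not the production algorithm.

```python
import numpy as np

def invariant_matrix(points: np.ndarray) -> np.ndarray:
    """Shift-, rotation-, and scale-invariant matrix for (N, 2) points."""
    diff = points[:, None, :] - points[None, :, :]
    d = np.linalg.norm(diff, axis=-1)   # pairwise Euclidean distances
    return d / d.mean()                 # normalizing removes scale

def signature_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Mean absolute difference between two invariant matrices."""
    return float(np.abs(invariant_matrix(a) - invariant_matrix(b)).mean())
```

Because the matrix ignores placement, orientation, and size, a genuine signature photographed at an angle still matches its reference, while a differently shaped signature does not.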
Accuracy
Formula: Accuracy = \frac{TP + TN}{TP + TN + FP + FN}
Shows the proportion of correct results among all predictions
Used for balanced data
Precision
Formula: Precision = \frac{TP}{TP + FP}
Shows the proportion of correctly classified positive cases among all predicted positives
Important when it is necessary to minimize false positives
Recall
Formula: Recall = \frac{TP}{TP + FN}
Shows the proportion of correctly found positive cases among all actual positives
Critical when missed positives are unacceptable
F1-score (Balanced metric)
Formula: F1 = \frac{2 \times Precision \times Recall}{Precision + Recall}
Harmonic mean of precision and recall
Used for unbalanced data
ROC-AUC
Evaluates the quality of the classifier regardless of the threshold
Shows the balance between True Positive Rate and False Positive Rate
MSE (Mean Squared Error)
Formula: MSE = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2
Measures the mean squared difference between the actual and predicted values
MAE (Mean Absolute Error)
Measures the mean absolute deviation
Less sensitive to outliers
MAPE (Mean Absolute Percentage Error)
Measures the mean absolute percentage error
Convenient for comparing models at different scales
R² (Determination Coefficient)
Shows the proportion of variance explained by the model
Close to 1 for a good model
Confusion Matrix
A table showing all types of classification errors
Helps to analyze model behavior in detail
Log Loss
Estimates the confidence of the model in the results
Important when working with probabilistic models
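The classification metrics listed above follow directly from the four confusion-matrix counts (TP, TN, FP, FN). A small worked example, using only the formulas given here:

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute Accuracy, Precision, Recall, and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)            # share of predicted positives that are real
    recall = tp / (tp + fn)               # share of real positives that were found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

For instance, with TP=80, TN=90, FP=20, FN=10 this gives accuracy 0.85, precision 0.80, recall ≈ 0.889, and F1 ≈ 0.842 — a model with slightly more false alarms than missed positives.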
Yes, we are working on algorithms to restore lost images. The results depend heavily on the source information: the less of it there is, the broader the interpretation of the results. If there are few references, the AI will not restore the original image but will create a completely new one, as Midjourney does, falling back on low-priority libraries.
There are closed databases of museums and laboratories that we would like to connect to. But there is no legal basis for this. Museum lawyers have not previously encountered agreements similar to ours and do not understand how to work with them. There is also the problem of the high cost of such a connection. On average, $100 for using data on one work. And we need hundreds of thousands. Negotiations with copyright holders are ongoing.