17:05:00 RRSAgent has joined #publishingcg
17:05:04 logging to https://www.w3.org/2025/05/15-publishingcg-irc
17:05:45 scribe: wolfgang
17:06:21 gautier: Talk of Senthil (Ailaysa, Chennai) - we are taking notes
17:06:46 presentmiia
17:06:51 present+ miia
17:07:06 senthil: speak about the concept, then provide a demo, then Q & A
17:09:07 ... Senthil Nathan from Ailaysa - AI company - content translation based on AI - taking content in different languages - international book fair in Chennai - introduced products into publishing - before mainly translation/localization - automatic translations using AI
17:13:06 ... concepts: how to develop responsible content in an AI context - we cannot have walled days - great data rush for training AI systems without knowledge and permission of owners - awareness that quality content is very important for AI - quality data should come from publishers, media companies, research institutes - shifting to being active negotiators
17:15:45 ... Exclusion of content as training data - if used, responsible usage and permission are needed - in 2024 people are actively discussing - should be a fair deal with proper compensation - illegal scraping was a big problem - is coming to an end - much more reduced now
17:17:55 ... terms of permission are set by both parties - technical barriers can now be easily implemented - clear legal terms prohibiting use without limits - content watermarking and provenance tracking tools
17:20:31 ... to include: fair licensing terms - mandatory source citations in AI output - quality control: selective participation with responsible AI companies - usage tracking: monitoring how content influences AI responses - consent frameworks: granular control over AI uses
17:21:16 ... factors: technical, business, regulatory and market dynamics
17:23:03 Michalis has joined #publishingcg
17:23:19 ... AI-specific exclusion protocols (better than robots.txt) - rise of new AI crawlers (require new blocking mechanisms) - dynamic paywalls and anti-scraping tech - emergence of content-tracking tools
17:24:55 ... blockers (NYT, Guardian) vs. partners (Axel Springer with OpenAI) vs. open access (but seeking attribution) vs. wait-and-see
17:26:07 ... EU: AI Act - US: considering legal framework - courses of copyright offices
17:28:23 ... market: growing need for high-quality content - AI is not thinking, algorithmic, not creative - publishers see new revenue streams via partnerships - data brokers like literary agency - syndication rights
17:29:22 ... principle of fair monetization - important to track extent of usage and kinds of usage
17:31:45 ... from authoring to reading: AI environment is set - book discovery enhanced through LLM recommendation and search systems - going beyond metadata and keywords: asking questions on the contents of the book (e.g. ChaiReader)
17:33:37 ... future options: read book in another language such as Tamil thanks to automatic translation or as audiobook - in libraries, bookstores, schools use of books may be changed
17:34:58 ... HarperCollins works with MS, also Sage, CUP
17:34:58 ... have to find common ground between publishers and AI companies
17:46:50 Demo ChaiReader: Reading, Chatting and Buying in one portal - multilingual Q&A - buy routine integrated - in future: book recommendations based on search terms - translation of a book into a target language
17:47:10 q+
17:48:39 gautier: when I'm chatting with a book, answers come only from the book content - LLM only used to prepare a nice answer - no training of the LLM on each book
17:48:59 senthil: completely separated
17:49:24 ack Michalis
17:50:36 michalis: concerned that access to content should be fair use - esp. in the US - next months will be critical in legal aspects
17:51:18 senthil: big publishers have great interest - different for small publishers or even authors
17:51:47 michalis: in education or academia this would be quite useful
17:54:57 senthil: exactly, useful to explore several books in parallel to formulate an answer - we work with EDRLabs to improve on it - ChaiReader still in Beta - working with publishers - can chat with a collection of books, not only one at the same time - impact of "AI on economics" - reasoning capacity - more important than just referring back - great thing for book discovery
17:55:47 ivan: aren't you forced to make some sort of ranking between books consumed - need a local ranking for books you have
17:56:08 senthil: possible to rank or categorize depending on prompting
17:57:37 vishal: the more correct the prompt, the more precise the answer will be - if 3 books have an answer - semantic ranking combined with keyword-level ranking - still experimental feature - as Google and Amazon do
17:58:56 ivan: in some cases this is not the best answer - in scholarly usage - ranking by systems outside your bookshop - based on reputation of answers - you use LLM only for niceties of input and output
18:00:23 vishal: reinforcement learning - librarian knows the authors - DeepSeek uses this feature - integrate human expertise into machine
18:00:42 senthil: good question
18:01:59 RSSAgent make minutes
18:02:42 rrsagent, draft minutes
18:02:43 I have made the request to generate https://www.w3.org/2025/05/15-publishingcg-minutes.html wolfgang
18:03:42 Zakimrssagent, make minutes
18:04:24 rssagent, make minutes
18:04:41 rrsagent, make logs public
18:05:58 rssagent, set meeting title to publishing community group
18:06:46 rssagent, set meeting title to "publishing community group"