{"id":20335,"date":"2024-03-06T04:05:14","date_gmt":"2024-03-06T04:05:14","guid":{"rendered":"https:\/\/interface.media\/?p=20335"},"modified":"2024-03-13T12:48:26","modified_gmt":"2024-03-13T12:48:26","slug":"small-language-models-could-make-generative-ai-more-ethical-and-better","status":"publish","type":"post","link":"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/","title":{"rendered":"Small language models could make generative AI more ethical"},"content":{"rendered":"\n<p>The emergence of sophisticated generative artificial intelligence (AI) applications\u2014including image generators like Midjourney and conversational chatbots like OpenAI\u2019s ChatGPT\u2014has sent shockwaves through the economy and popular culture in equal measure. The technology, made accessible to a massive audience in a short span of time, has attracted immense interest, <a href=\"https:\/\/interface.media\/blog\/executiveinsights\/state-of-vermont-using-ai-for-good\/\">investment<\/a>, and <a href=\"https:\/\/interface.media\/blog\/2024\/03\/06\/generative-ai-is-creating-hew-headaches-for-cybersecurity-teams\/\">controversy<\/a>.<\/p>\n\n\n\n<p>Aside from criticisms rooted in <a href=\"https:\/\/techpolicy.press\/taylor-swift-shows-us-whats-coming-next-in-gender-and-tech-and-advocates-should-be-concerned\">the role played by generative AI in creating sexually explicit deepfakes<\/a> of Taylor Swift, spreading misinformation, and reinforcing prejudicial biases, the most prominent controversy surrounding the technology stems from the legal and ethical issues relating to the data used to train large language models (LLMs).<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-generative-ai-large-language-models-on-unstable-ethical-ground\">Generative AI large language models on unstable ethical ground<\/h3>\n\n\n\n<p>According to ChatGPT 3.5 itself, 
LLMs are \u201ctrained on a vast dataset of text from various sources, including books, articles, websites, and other publicly available written material. This data helps us learn patterns and structures of language to generate responses and assist users.\u201d<\/p>\n\n\n\n<p>Essentially, an LLM is trained on billions of lines of text scraped from across the internet. Because generative AI consumes so much information, it can convincingly mimic human writing and \u201ccreate\u201d responses based on the data it has examined. However, authors, journalists, and several news organisations have raised concerns. The issue they highlight is that training an LLM on scraped content written by human authors is, in effect, an uncredited and unpaid use of those writers\u2019 work.<\/p>\n\n\n\n<p>ChatGPT generates the response that \u201cwhile large language models learn from existing text, they do so within legal and ethical boundaries, aiming to respect intellectual property rights and promote responsible usage.\u201d<\/p>\n\n\n\n<p>A statement by the <a href=\"https:\/\/europeanwriterscouncil.eu\/gai-is-based-on-theft\/\">European Writers\u2019 Council<\/a> contradicts the claim. \u201cAlready, numerous criminal and damaging \u2018AI business models\u2019 have developed in the book sector \u2013 with fake authors, fake books and also fake readers,\u201d the Council says in a letter. According to the Council, the fundamental process of developing large language models such as GPT, Meta, StableLM, and BERT rests on using uncredited copyrighted work. These works, asserts the Council, are sourced from \u201cshadow libraries such as Library Genesis (LibGen), Z-Library (Bok), Sci-Hub and Bibliotik \u2013 piracy websites.\u201d<\/p>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"h-more-ethical-generative-ai-start-by-thinking-smaller\">More ethical generative AI? 
Start by thinking smaller<\/h3>\n\n\n\n<p>The most publicly visible forms of generative AI, like ChatGPT and Midjourney, are built on models with billions of parameters. Models of this size need enormous amounts of training data, which is why their developers crawl the web for every possible scrap of information in order to build up the quality of their responses. However, several recent developments in generative AI are \u201cchallenging the notion that scale is needed for performance.\u201d<\/p>\n\n\n\n<p>For a sense of scale, OpenAI\u2019s <a href=\"https:\/\/medium.com\/@mlubbad\/the-ultimate-guide-to-gpt-4-parameters-everything-you-need-to-know-about-nlps-game-changer-109b8767855a\">GPT-3, the predecessor of GPT-3.5, has 175 billion<\/a> parameters. OpenAI has not published a parameter count for GPT-4.<\/p>\n\n\n\n<p>Not every capable model needs to be that large. Microsoft has created two small language models (SLMs) called <a href=\"https:\/\/news.microsoft.com\/three-big-ai-trends-to-watch-in-2024\/\">Phi and Orca<\/a> which, under certain circumstances, outperform large language models.<\/p>\n\n\n\n<p>Unlike earlier generations\u2014trained on vast diets of disorganised, unvetted data\u2014SLMs use \u201ccurated, high-quality training data\u201d, according to <a href=\"https:\/\/www.linkedin.com\/in\/vanessahoseattle\">Vanessa Ho from Microsoft<\/a>.<\/p>\n\n\n\n<p>They are more specific in scope, use <a href=\"https:\/\/www.ibm.com\/blog\/artificial-intelligence-trends\/#_edn6\">less computing power<\/a> (and therefore less energy, which addresses another common criticism of generative AI models), and could produce more reliable results when trained with the right data, potentially making them more useful from a business point of view. 
In 2022, <a href=\"https:\/\/huggingface.co\/papers\/2203.15556\">DeepMind demonstrated<\/a> that, for a given compute budget, training smaller models on more data yields better performance than training larger models on less data.<\/p>\n\n\n\n<p>AI needs to find a way of escaping its ethically dubious beginnings if the technology is to live up to its potential. A transition from large language models to smaller models trained on higher-quality data sets would be a valuable step in the right direction.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Small Language Model AI trained on more data has the potential to be more ethical than large models trained on less information. <\/p>\n","protected":false},"author":480,"featured_media":20336,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"apple_news_api_created_at":"2024-03-06T04:05:18Z","apple_news_api_id":"1bf32b03-e66c-4beb-bdef-a0b0d6370815","apple_news_api_modified_at":"2024-03-13T12:48:24Z","apple_news_api_revision":"AAAAAAAAAAAAAAAAAAAAAQ==","apple_news_api_share_url":"https:\/\/apple.news\/AG_MrA-ZsS-u976Cw1jcIFQ","apple_news_cover_media_provider":"image","apple_news_coverimage":0,"apple_news_coverimage_caption":"","apple_news_cover_video_id":0,"apple_news_cover_video_url":"","apple_news_cover_embedwebvideo_url":"","apple_news_is_hidden":"","apple_news_is_paid":"","apple_news_is_preview":"","apple_news_is_sponsored":"","apple_news_maturity_rating":"","apple_news_metadata":"\"\"","apple_news_pullquote":"","apple_news_pullquote_position":"","apple_news_slug":"","apple_news_sections":[],"apple_news_suppress_video_url":false,"apple_news_use_image_component":false,"footnotes":""},"categories":[3],"tags":[],"topic":[614],"class_list":["post-20335","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-the-interface","topic-data-ai"],"acf":[],"apple_news_notices":[],"yoast_head":"<!-- This site is optimized with the Yoast 
SEO Premium plugin v26.6 (Yoast SEO v26.6) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Small language models could make generative AI more ethical - Interface<\/title>\n<meta name=\"description\" content=\"Small Language Model AI trained on more data has the potential to be more ethical than large models trained on less information.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<meta property=\"og:locale\" content=\"en_GB\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Small language models could make generative AI more ethical\" \/>\n<meta property=\"og:description\" content=\"Small Language Model AI trained on more data has the potential to be more ethical than large models trained on less information.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/\" \/>\n<meta property=\"og:site_name\" content=\"Interface\" \/>\n<meta property=\"article:published_time\" content=\"2024-03-06T04:05:14+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-03-13T12:48:26+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/interface.media\/wp-content\/uploads\/sites\/3\/2024\/03\/iStock-1439778251-e1709694760120.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1778\" \/>\n\t<meta property=\"og:image:height\" content=\"1330\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Dan Brightmore\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Dan Brightmore\" \/>\n\t<meta name=\"twitter:label2\" content=\"Estimated reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" 
class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebPage\",\"@id\":\"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/\",\"url\":\"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/\",\"name\":\"Small language models could make generative AI more ethical - Interface\",\"isPartOf\":{\"@id\":\"https:\/\/interface.media\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/interface.media\/wp-content\/uploads\/sites\/3\/2024\/03\/iStock-1439778251-e1709694760120.jpg\",\"datePublished\":\"2024-03-06T04:05:14+00:00\",\"dateModified\":\"2024-03-13T12:48:26+00:00\",\"author\":{\"@id\":\"https:\/\/interface.media\/#\/schema\/person\/7c33499ca8e42b097028109cccb22748\"},\"description\":\"Small Language Model AI trained on more data has the potential to be more ethical than large models trained on less 
information.\",\"breadcrumb\":{\"@id\":\"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/#breadcrumb\"},\"inLanguage\":\"en-GB\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/#primaryimage\",\"url\":\"https:\/\/interface.media\/wp-content\/uploads\/sites\/3\/2024\/03\/iStock-1439778251-e1709694760120.jpg\",\"contentUrl\":\"https:\/\/interface.media\/wp-content\/uploads\/sites\/3\/2024\/03\/iStock-1439778251-e1709694760120.jpg\",\"width\":1778,\"height\":1330,\"caption\":\"Contemporary art collage. Conceptual image. Young motivated woman running to computer monitor with many books. Concept of education, online studying, knowledge development, information, growth\"},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/interface.media\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Small language models could make generative AI more ethical\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/interface.media\/#website\",\"url\":\"https:\/\/interface.media\/\",\"name\":\"Interface\",\"description\":\"Delivering World Class Content \u201cFrom Executive, For 
Executive\u201c\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/interface.media\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-GB\"},{\"@type\":\"Person\",\"@id\":\"https:\/\/interface.media\/#\/schema\/person\/7c33499ca8e42b097028109cccb22748\",\"name\":\"Dan Brightmore\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-GB\",\"@id\":\"https:\/\/interface.media\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/e9ca282f0ef431735a64685769ad57886e24b074c4c58314392755fb79164164?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/e9ca282f0ef431735a64685769ad57886e24b074c4c58314392755fb79164164?s=96&d=mm&r=g\",\"caption\":\"Dan Brightmore\"},\"url\":\"https:\/\/interface.media\/blog\/author\/dbrightmore\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Small language models could make generative AI more ethical - Interface","description":"Small Language Model AI trained on more data has the potential to be more ethical than large models trained on less information.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"og_locale":"en_GB","og_type":"article","og_title":"Small language models could make generative AI more ethical","og_description":"Small Language Model AI trained on more data has the potential to be more ethical than large models trained on less information.","og_url":"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/","og_site_name":"Interface","article_published_time":"2024-03-06T04:05:14+00:00","article_modified_time":"2024-03-13T12:48:26+00:00","og_image":[{"width":1778,"height":1330,"url":"https:\/\/interface.media\/wp-content\/uploads\/sites\/3\/2024\/03\/iStock-1439778251-e1709694760120.jpg","type":"image\/jpeg"}],"author":"Dan Brightmore","twitter_card":"summary_large_image","twitter_misc":{"Written by":"Dan Brightmore","Estimated reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"WebPage","@id":"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/","url":"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/","name":"Small language models could make generative AI more ethical - 
Interface","isPartOf":{"@id":"https:\/\/interface.media\/#website"},"primaryImageOfPage":{"@id":"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/#primaryimage"},"image":{"@id":"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/#primaryimage"},"thumbnailUrl":"https:\/\/interface.media\/wp-content\/uploads\/sites\/3\/2024\/03\/iStock-1439778251-e1709694760120.jpg","datePublished":"2024-03-06T04:05:14+00:00","dateModified":"2024-03-13T12:48:26+00:00","author":{"@id":"https:\/\/interface.media\/#\/schema\/person\/7c33499ca8e42b097028109cccb22748"},"description":"Small Language Model AI trained on more data has the potential to be more ethical than large models trained on less information.","breadcrumb":{"@id":"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/#breadcrumb"},"inLanguage":"en-GB","potentialAction":[{"@type":"ReadAction","target":["https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/"]}]},{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/#primaryimage","url":"https:\/\/interface.media\/wp-content\/uploads\/sites\/3\/2024\/03\/iStock-1439778251-e1709694760120.jpg","contentUrl":"https:\/\/interface.media\/wp-content\/uploads\/sites\/3\/2024\/03\/iStock-1439778251-e1709694760120.jpg","width":1778,"height":1330,"caption":"Contemporary art collage. Conceptual image. Young motivated woman running to computer monitor with many books. 
Concept of education, online studying, knowledge development, information, growth"},{"@type":"BreadcrumbList","@id":"https:\/\/interface.media\/blog\/2024\/03\/06\/small-language-models-could-make-generative-ai-more-ethical-and-better\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/interface.media\/"},{"@type":"ListItem","position":2,"name":"Small language models could make generative AI more ethical"}]},{"@type":"WebSite","@id":"https:\/\/interface.media\/#website","url":"https:\/\/interface.media\/","name":"Interface","description":"Delivering World Class Content \u201cFrom Executive, For Executive\u201c","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/interface.media\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-GB"},{"@type":"Person","@id":"https:\/\/interface.media\/#\/schema\/person\/7c33499ca8e42b097028109cccb22748","name":"Dan Brightmore","image":{"@type":"ImageObject","inLanguage":"en-GB","@id":"https:\/\/interface.media\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/e9ca282f0ef431735a64685769ad57886e24b074c4c58314392755fb79164164?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e9ca282f0ef431735a64685769ad57886e24b074c4c58314392755fb79164164?s=96&d=mm&r=g","caption":"Dan 
Brightmore"},"url":"https:\/\/interface.media\/blog\/author\/dbrightmore\/"}]}},"_links":{"self":[{"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/posts\/20335","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/users\/480"}],"replies":[{"embeddable":true,"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/comments?post=20335"}],"version-history":[{"count":3,"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/posts\/20335\/revisions"}],"predecessor-version":[{"id":20438,"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/posts\/20335\/revisions\/20438"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/media\/20336"}],"wp:attachment":[{"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/media?parent=20335"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/categories?post=20335"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/tags?post=20335"},{"taxonomy":"topic","embeddable":true,"href":"https:\/\/interface.media\/wp-json\/wp\/v2\/topic?post=20335"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}