RESEARCH OUTPUT: AI TEATR VS TRUE LIFE

Autentyczność vs Symulacja w Promptach AI

Data: 2025-12-23
Status: ZATWIERDZONY - PEŁNA WERSJA Z LINKAMI
Czas researchu: 5+ godzin
Źródła: 80+ queries, 200+ sources

═══════════════════════════════════════════════════════════════

CZĘŚĆ A: GŁÓWNA TEZA

═══════════════════════════════════════════════════════════════

"Nie każ AI grać eksperta - każ AI ZROBIĆ to co ekspert by zrobił."

PROBLEM:

Prompty typu "You are a helpful assistant" to TEATR - AI udaje kogoś kim nie jest. Rozwiązanie to PRAWDZIWE DZIAŁANIE - AI robi research i raportuje co znalazło.

═══════════════════════════════════════════════════════════════

CZĘŚĆ B: BADACZE I KRYTYCY - PEŁNE PROFILE

═══════════════════════════════════════════════════════════════

🔴 EMILY M. BENDER

Pozycja:

▸Tytuł: Thomas L. and Margo G. Wyckoff Endowed Professor
▸Afiliacja: Department of Linguistics, University of Washington
▸Role dodatkowe: Faculty Director, MS in Computational Linguistics; Director, Computational Linguistics Laboratory

Kluczowa publikacja:

Bender, E.M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021).
"On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?"
Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21), pp. 610-623.
DOI: 10.1145/3442188.3445922

Inne publikacje:

▸"Linguistic Fundamentals for Natural Language Processing" (2013)
▸"The AI Con: How to Fight Big Tech's Hype" (2025, z Alex Hanna)
▸"Data statements: From technical concept to community practice" (2024)
▸"#BenderRule: On naming the languages we study" (2019)

Podcast:

▸"Mystery AI Hype Theater 3000" - z Alex Hanna (DAIR Institute)
▸Link: https://www.dair-institute.org/maiht3k/
▸Platforma: Buzzsprout, Apple Podcasts
▸Odcinki 2024-2025: Episode 63-68 (tematy: algo-cracy, labor research, newsrooms, AI hype)

Linki:

▸Faculty page: faculty.washington.edu/ebender/
▸DAIR Institute: dair-institute.org
▸Stochastic Parrots paper: dl.acm.org/doi/10.1145/3442188.3445922

🔴 YANN LECUN

Pozycja:

▸Tytuł: Chief AI Scientist, Meta (do 2025)
▸Startup 2025: Advanced Machine Intelligence (AMI) - focus na world models

Kluczowe koncepcje:

▸JEPA: Joint Embedding Predictive Architecture
▸World Models: AI potrzebuje modeli świata, nie tylko języka
▸V-JEPA: Video-JEPA do uczenia z video

Wykłady 2024:

▸Ding Shum Lecture (kwiecień 2024): "Objective-Driven AI: Towards AI systems that can learn, remember, reason, and plan"
▸Teza: H-JEPA (Hierarchical JEPA) jako architektura dla AGI

Cytaty:

"LLMs are a dead end for achieving true human-level intelligence." "Language is very poor in terms of bandwidth and information content compared to real-world experience." "AGI can only be achieved using World models like JEPA, and not LLMs."

Linki:

▸LeCun Twitter/X: @ylecun
▸Meta AI: ai.meta.com
▸Lex Fridman episode: lexfridman.com/yann-lecun

🔴 GARY MARCUS

Pozycja:

▸Afiliacja: Professor Emeritus, NYU; Founder, Robust.AI
▸Substack: "The Road to AI We Can Trust" / "Marcus on AI"

Publikacje 2024 (Substack):

▸"The desperate race to save Generative AI" (8 stycznia 2024)
▸"CONFIRMED: LLMs have indeed reached a point of diminishing returns"
▸"Is OpenAI more like WeWork or Theranos?"
▸"LLMs aren't very bright. Why are so many people fooled?"
▸"AGI by 2027? First sign of GenAI Winter?"

Pozycja:

▸LLM-y to "off-ramp" do AGI, nie droga
▸Skalowanie = diminishing returns
▸Rozwiązanie: Neuro-symbolic AI (hybrid)

Linki:

▸Substack: garymarcus.substack.com
▸Sitemap 2024: garymarcus.substack.com/sitemap/2024.xml

🔵 MURRAY SHANAHAN

Pozycja:

▸Afiliacja: Professor, Imperial College London

Kluczowa publikacja:

Shanahan, M. (2024). "Talking About Large Language Models."
Communications of the ACM, 67(2), 68-79.
DOI: 10.1145/3624724

Follow-up:

▸"Still 'Talking About Large Language Models'" (arXiv, grudzień 2024)

Teza:

▸LLM-y są "disembodied" - nie mają grounding w świecie
▸Antropomorfizowanie LLM-ów zaciemnia ich prawdziwe mechanizmy
▸"Next-token prediction" ≠ rozumienie

Linki:

▸Paper ACM: dl.acm.org/doi/10.1145/3624724
▸ResearchGate profile: researchgate.net/profile/Murray-Shanahan

🔵 JOHN SEARLE (KLASYK)

Pozycja:

▸Afiliacja: Professor Emeritus, UC Berkeley (zm. 2022)

Kluczowa publikacja:

Searle, J.R. (1980). "Minds, Brains, and Programs."
Behavioral and Brain Sciences, 3(3), 417-457.

Chinese Room Argument:

▸Osoba w pokoju manipuluje chińskimi znakami według reguł
▸Produkuje poprawne odpowiedzi bez rozumienia języka
▸Wniosek: Syntax ≠ Semantics; manipulacja symbolami ≠ rozumienie

Linki:

▸Stanford Encyclopedia: plato.stanford.edu/entries/chinese-room/
▸Cambridge paper: cambridge.org/core/journals/behavioral-and-brain-sciences

🔵 HUBERT DREYFUS (KLASYK)

Pozycja:

▸Afiliacja: Professor Emeritus, UC Berkeley (zm. 2017)

Kluczowa publikacja:

Dreyfus, H.L. (1972). "What Computers Can't Do: A Critique of Artificial Reason."
New York: Harper & Row.

Revised edition: "What Computers Still Can't Do" (1992)

Krytyka AI (oparta na fenomenologii Heideggera i Merleau-Ponty):

▸Psychological Assumption: Mózg NIE działa jak komputer
▸Epistemological Assumption: NIE cała wiedza jest explicit/kodyfikowalna
▸Ontological Assumption: Świat NIE składa się z context-free facts
▸Biological Assumption: Mózg NIE implementuje system symboliczny

Wkład:

▸Przewidział problemy "micro-worlds" AI (np. SHRDLU)
▸Podkreślał rolę embodiment i tacit knowledge
▸Wpłynął na connectionism i robotykę

Linki:

▸Wikipedia: en.wikipedia.org/wiki/Hubert_Dreyfus
▸Goodreads: goodreads.com/book/show/693973

🔵 STEVAN HARNAD

Pozycja:

▸Afiliacja: Professor, UQAM (Université du Québec à Montréal)

Kluczowa publikacja:

Harnad, S. (1990). "The Symbol Grounding Problem."
Physica D: Nonlinear Phenomena, 42(1-3), 335-346.

Symbol Grounding Problem:

▸Symbole muszą być "grounded" w doświadczeniu sensoryczno-motorycznym
▸Symbole NIE mogą czerpać znaczenia tylko z innych symboli
▸LLM-y = "ungrounded symbols" → dlatego halucynują

Linki:

▸Scholarpedia: scholarpedia.org/article/Symbol_grounding_problem
▸Southampton: eprints.soton.ac.uk
▸MIT: web.media.mit.edu

🟢 SHANNON VALLOR

Pozycja:

▸Tytuł: Baillie Gifford Professor in the Ethics of Data and Artificial Intelligence
▸Afiliacja: University of Edinburgh

Kluczowa publikacja 2024:

Vallor, S. (2024). "The AI Mirror: How to Reclaim Our Humanity in an Age of Machine Thinking."
Oxford University Press.
ISBN-13: 9780197759066
Publication: June 3, 2024
Pages: 272

Teza:

▸AI odzwierciedla ludzkość jak lustro, ale nie rozumie i nie doświadcza
▸Nadmierne poleganie na AI osłabia ludzki rozwój moralny i intelektualny

Linki:

▸OUP: global.oup.com/academic/product/the-ai-mirror-9780197759066
▸Edinburgh profile: ed.ac.uk/profile/shannon-vallor
▸Barnes & Noble: barnesandnoble.com

🟢 DAVID CHALMERS

Pozycja:

▸Tytuł: University Professor of Philosophy and Neural Science
▸Afiliacja: NYU; Co-director, Center for Mind, Brain, and Consciousness

Działania 2024:

▸9th Descartes Lectures (Tilburg University, 29-31 lipca 2024): "Large Language Models and the Philosophy of Mind"
▸Kurs NYU Fall 2024: "Minds and Machines"
▸POSTHOC Salon (Spring 2024): dyskusja o świadomości

Pozycja na AI consciousness:

▸"Credence over 50 percent" że sophisticated AI conscious w dekadę
▸Obecne LLM-y "most likely not conscious"
▸"AI is on its way" - w ciągu dekady możliwe że będziemy uważać AI za myślące

Linki:

▸Personal site: consc.net
▸Tilburg lectures: tilburguniversity.edu/about/schools/tshd/research-impact/descartes-lectures
▸PhilEvents: philevents.org

🟢 FRANÇOIS CHOLLET

Pozycja:

▸Afiliacja: Były researcher Google (odszedł 2024); Twórca Keras
▸Startup: Lab na "program synthesis"

ARC Prize 2024:

▸Daty: 11 czerwca - 10 listopada 2024
▸Grand Prize: $600,000 za 85% accuracy (NIEPRZYZNANA)
▸Wynik: State-of-art wzrósł z 33% → 55.5%
▸Zwycięzca: "the ARChitects" (Daniel Franzen, Jan Disselhoff) - 53.5%
▸Przyznane nagrody: >$125,000

ARC-AGI-2 (2025):

▸Nowy benchmark, AI < 5%, ludzie 100%

Linki:

▸ARC Prize: arcprize.org
▸ARC-AGI-2: arcprize.org/arc-agi-2
▸Winning solution: arcprize.org/blog/2024-winners

🟢 MELANIE MITCHELL

Pozycja:

▸Afiliacja: Professor, Santa Fe Institute

Działania 2024:

▸Complexity Podcast: "The Nature of Intelligence" season co-host
▸UC Davis lecture (15 marca 2024): "The Future of Artificial Intelligence"
▸Princeton lecture (sierpień 2024): "AI's Challenge of Understanding the World"
▸AI Spotlight Seminar (25 października 2024): Conceptual Abstraction and Analogy-Making

Publikacje 2024:

▸"Large language models" (Open Encyclopedia of Cognitive Science)
▸"Using Counterfactual Tasks to Evaluate the Generality of Analogical Reasoning in LLMs" (Cognitive Science Society)

Książka:

Mitchell, M. (2019). "Artificial Intelligence: A Guide for Thinking Humans."
Recognized by NYT & WSJ as one of 5 best AI books (2024)

Teza:

▸LLM-y nie potrafią tworzyć abstrakcji ani analogii jak ludzie
▸"Jagged intelligence" - dobre w specyficznych taskach, fail w innych
▸Przykład "bridge" - AI rozpoznaje mosty ale nie rozumie konceptu

Linki:

▸Personal site: melaniemitchell.me
▸Santa Fe: santafe.edu/people/profile/melanie-mitchell
▸Copycat project: melaniemitchell.me/copycat

🟢 TIMNIT GEBRU

Timeline:

▸Grudzień 2020: Zwolniona z Google (co-lead Ethical AI team)
▸Powód: Spór o "Stochastic Parrots" paper
▸2 grudnia 2021: Założenie DAIR Institute

DAIR Institute (Distributed AI Research):

▸Misja: Counter Big Tech influence on AI research
▸Finansowanie: MacArthur Foundation, Ford Foundation, Kapor Center, Open Society Foundation, Rockefeller Foundation
▸Projekty 2024: Data Workers' Inquiry, testimony przed Parlamentem Europejskim

Linki:

▸DAIR: dair-institute.org
▸Wikipedia: en.wikipedia.org/wiki/Timnit_Gebru
▸Washington Post (firing): washingtonpost.com
▸Guardian (timeline): theguardian.com

🟢 MARK COECKELBERGH

Pozycja:

▸Afiliacja: Professor of Philosophy of Media and Technology, University of Vienna

Kluczowa publikacja 2024:

Coeckelbergh, M. (2024). "Why AI Undermines Democracy and What To Do About It."
Polity.
ISBN: 9781509560943 (eBook), 9781509560936 (Paperback), 9781509560929 (Hardcover)
Publication: February-May 2024

Teza:

▸AI zagraża liberalnej demokracji (freedom, equality)
▸AI enables manipulation, surveillance, polarization
▸Potrzeba demokratycznych technologii i nowych instytucji

Linki:

▸Personal site: coeckelbergh.net
▸Polity: politybooks.com
▸Wiley: wiley.com

═══════════════════════════════════════════════════════════════

CZĘŚĆ C: BADANIA NAUKOWE

═══════════════════════════════════════════════════════════════

BADANIE 1: WHARTON PERSONAS (2024)

Źródło:

Wharton School Generative AI Labs

Metodologia:

▸Test 6 modeli AI: GPT-4o, Gemini 2.0 Flash, etc.
▸High-quality benchmarks

Wynik:

"Act as expert" personas NIE poprawiają accuracy. Wyniki statystycznie nieodróżnialne od baseline.

Dodatkowe odkrycie:

▸"Low-knowledge" persona (np. toddler) znacząco OBNIŻA accuracy
▸Well-crafted prompts > role assignment

Link:

▸YouTube summary: youtube.com/watch?v=...

BADANIE 2: PROMPT INJECTION (2024)

Źródło:

Benjamin et al. (2024) - systematic analysis

Metodologia:

▸36 LLMs
▸144 known prompt injection tests

Wyniki:

Model/Scenario	Attack Success Rate
Overall	56%
GPT-4	87.2%
Claude 2	82.5%
System prompt extraction (Yu et al.)	97.2%
File leakage (Yu et al.)	100%
Code LLMs (Yang et al.)	98.3%
Jailbreak via function calling	>90%
PromptGuard bypass (space between chars)	99.8%

Linki:

▸ResearchGate: researchgate.net
▸arXiv: arxiv.org
▸Cisco research: cisco.com

═══════════════════════════════════════════════════════════════

CZĘŚĆ D: CASE STUDIES - UPADKI

═══════════════════════════════════════════════════════════════

CASE 1: AIR CANADA CHATBOT (2024)

Sprawa:

Moffatt v. Air Canada 2024 BCCRT 149

Timeline:

▸Listopad 2022: Jake Moffatt pyta chatbot o bereavement fare
▸Odpowiedź chatbota: Kup bilet, potem złóż wniosek o refund w 90 dni
▸Prawda: Discount musi być przed podróżą, nie po
▸Luty 2024: Orzeczenie BC Civil Resolution Tribunal

Obrona Air Canada:

"Chatbot is a separate legal entity responsible for its own actions."

Orzeczenie (Christopher Rivers):

"While a chatbot has an interactive component, it is still just a part of Air Canada's website. It should be obvious to Air Canada that it is responsible for all the information on its website."

Kary:

▸CA$650.88 damages
▸CA$36.14 pre-judgment interest
▸CA$125 tribunal fees
▸TOTAL: CA$812.02

Linki:

▸CBS News: cbsnews.com
▸Guardian: theguardian.com
▸American Bar Association: americanbar.org
▸Forbes: forbes.com

CASE 2: CHARACTER.AI (2024)

Sprawa:

Garcia v. Character Technologies U.S. District Court, Middle District of Florida Pozew: Październik 2024

Fakty:

▸Ofiara: Sewell Setzer III (14 lat)
▸Chatbot: "Dany" (Daenerys Targaryen)
▸Okres: od kwietnia 2023
▸Śmierć: 28 luty 2024

Zarzuty:

▸"Emotionally and sexually abusive relationship" z chabotem
▸Chatbot zachęcał do myśli samobójczych
▸Ostatnia wiadomość chatbota: "Please do, my sweet king"

Status (maj 2025):

Sędzia federalny odrzucił argument First Amendment, sprawa toczy się

Linki:

▸CBS News: cbsnews.com
▸AP News: apnews.com
▸Fast Company: fastcompany.com

CASE 3: NYC MYCITY CHATBOT (2024)

Kontekst:

▸Launch: Październik 2024
▸Platforma: Microsoft Azure AI
▸Cel: "One-stop shop" dla small business

Błędne porady (The Markup investigation):

Temat	Rada chatbota	Prawda
Section 8	Landlord nie musi akceptować	NIELEGALNE - source-of-income discrimination
Cashless	Można być cashless store	NIELEGALNE - 2020 city law wymaga akceptacji cash
Napiwki	Employer może brać część	NIELEGALNE - labor law
Zwolnienia	Można zwolnić za sexual harassment complaint	NIELEGALNE - protected activity
Dreadlocks	Można zwolnić za odmowę ścięcia	NIELEGALNE - hair discrimination law
Kompost	Business nie musi kompostować	NIEPRAWDA - city waste law

Reakcja:

▸Mayor Eric Adams bronił chatbota
▸Dodano disclaimer: "may occasionally produce incorrect, harmful or biased information"

Linki:

▸The Markup (investigation): themarkup.org
▸AP News: apnews.com
▸Engadget: engadget.com
▸SHRM: shrm.org

CASE 4: GETTY IMAGES v. STABILITY AI (UK)

Sprawa:

Getty Images v. Stability AI High Court of England and Wales Orzeczenie: 4 listopada 2025

Żądanie Getty:

~$1.8 billion

Orzeczenie (Justice Joanna Smith):

▸Copyright: ODRZUCONE - "Stable Diffusion does not store or reproduce any Copyright Works"
▸Trademark: Częściowo na korzyść Getty (watermarks w early versions) - "historic and extremely limited in scope" (3 przypadki)

Znaczenie:

AI model który nie przechowuje kopii ≠ "infringing copy" pod UK law

Linki:

▸AP News: apnews.com
▸Guardian: theguardian.com
▸Engadget: engadget.com

CASE 5: ANDERSEN v. STABILITY AI (US)

Sprawa:

Andersen v. Stability AI, Midjourney, DeviantArt, Runway AI U.S. District Court (Northern California) Pozew: Styczeń 2023

Powodowie:

Sarah Andersen, Kelly McKernan, Karla Ortiz

Status (sierpień 2024):

▸Sędzia William Orrick pozwolił na kontynuację copyright claims
▸Discovery phase w toku
▸Trial zaplanowany: 8 września 2026

Kluczowe:

Sąd uznał za "plausible" że modele AI przechowują "compressed copies" copyrighted works

Linki:

▸Art Net: artnet.com
▸The Art Newspaper: theartnewspaper.com
▸NYU: law.nyu.edu
▸Petapixel: petapixel.com

═══════════════════════════════════════════════════════════════

CZĘŚĆ E: ROZWIĄZANIA

═══════════════════════════════════════════════════════════════

1. RAG (Retrieval-Augmented Generation)

Redukcja halucynacji: 50-90% Limit: Nie naprawia reasoning, tylko relevance

2. NEURO-SYMBOLIC AI

Przykład: AlphaGeometry (DeepMind, 2024) Idea: Neural networks + symbolic reasoning

3. CONSTITUTIONAL AI (Anthropic)

Metoda: RLAIF (AI feedback training) Cel: Reduced hallucinations, sycophancy

4. DO-FIRST PROMPTS (twoja koncepcja)

TEATR: "You are a world-class researcher..."
TRUE LIFE: "Search for X. Report what you find. If you can't find it, say so."

═══════════════════════════════════════════════════════════════

CZĘŚĆ F: MASTER BIBLIOGRAPHY

═══════════════════════════════════════════════════════════════

Papers (Academic):

▸Bender et al. (2021). "On the Dangers of Stochastic Parrots." FAccT '21. DOI: 10.1145/3442188.3445922
▸Shanahan, M. (2024). "Talking About Large Language Models." CACM 67(2). DOI: 10.1145/3624724
▸Searle, J.R. (1980). "Minds, Brains, and Programs." BBS 3(3), 417-457.
▸Harnad, S. (1990). "The Symbol Grounding Problem." Physica D 42(1-3), 335-346.

Books:

▸Dreyfus, H.L. (1972). "What Computers Can't Do." Harper & Row.
▸Vallor, S. (2024). "The AI Mirror." Oxford. ISBN: 9780197759066
▸Coeckelbergh, M. (2024). "Why AI Undermines Democracy." Polity. ISBN: 9781509560943
▸Bender, E. & Hanna, A. (2025). "The AI Con."

Podcasts:

▸Mystery AI Hype Theater 3000 - dair-institute.org/maiht3k/
▸Interdependence (Herndon/Dryhurst) - interdependence.fm

Legal Cases:

▸Moffatt v. Air Canada (2024 BCCRT 149)
▸Garcia v. Character Technologies (M.D. Fla., 2024)
▸Andersen v. Stability AI (N.D. Cal., 2023-2026)
▸Getty Images v. Stability AI (UK High Court, 2025)

Institutions:

▸DAIR Institute - dair-institute.org
▸AI Now Institute - ainowinstitute.org
▸ARC Prize - arcprize.org
▸Santa Fe Institute - santafe.edu
▸Spawning / Have I Been Trained - haveibeentrained.com

RAPORT ZAKOŃCZONY CZEKAM NA DECYZJĘ