Artificial Intelligence (AI) systems are frequently described as autonomous agents capable of independent reasoning and truth discovery. This report challenges that characterization by examining the epistemic and structural limitations of AI systems. It demonstrates that AI remains fundamentally dependent on human-generated inputs for data collection, interpretation, judgment, and truth validation. Using a concrete example in which a dominant dataset encodes a false cosmological belief—namely that the sun revolves around the Earth—this study shows that AI systems cannot independently distinguish truth from consensus. The findings suggest that AI lacks epistemic grounding and cannot function as an autonomous source of knowledge.
Advances in machine learning have led to increasingly capable AI systems that perform complex analytical and predictive tasks. These developments have fueled claims that AI may soon operate as an independent intelligence. However, such claims often conflate computational capability with epistemic autonomy.
This report evaluates whether AI systems can independently acquire, validate, and correct knowledge. It argues that AI lacks the capacity for independent observation, judgment, and truth-seeking, and therefore remains reliant on human inputs at every critical stage of operation.
Operational autonomy refers to a system’s ability to execute tasks without continuous human supervision. Many modern AI systems satisfy this criterion.
Epistemic autonomy refers to the ability to independently determine what is true by observing reality, questioning assumptions, and correcting errors. This report focuses on epistemic autonomy and demonstrates that AI systems do not possess it.
AI systems do not originate data. They operate on information that has already been collected, recorded, labeled, and structured by humans.
Examples include demographic data (age, name, location), behavioral data (visits, purchases), and linguistic data (text, labels). Even automated pipelines depend on prior human decisions about what data should be collected and why. AI therefore operates exclusively on representations of reality rather than reality itself.
Web scraping is frequently cited as evidence that AI can collect data independently. However, scraped data is entirely human-generated.
Web pages and digital records exist only because humans wrote, recorded, published, and curated them.
AI scraping systems merely reprocess these representations. They cannot evaluate truth, detect intent, or recognize deception. Thus, scraping does not constitute independent knowledge acquisition.
AI systems can receive sensory data through cameras, microphones, and other instruments. While this allows them to ingest raw signals, it does not equate to perception.
Humans determine which sensors to deploy, what to measure, how signals are encoded, and how the resulting data are interpreted.
Sensors capture measurements; they do not supply meaning. AI systems processing sensor data remain constrained by human-defined interpretive frameworks.
Human judgment involves ethical evaluation, cultural context, and awareness of long-term consequences. AI systems lack intrinsic values and instead optimize predefined objective functions.
As a result, AI does not evaluate correctness or importance in a normative sense. Apparent judgment is a statistical reflection of historical human decisions rather than genuine evaluation.
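This point can be made concrete with a minimal sketch. The example below is purely illustrative (the "loan" scenario and the `learned_judgment` helper are invented for this report, not drawn from any real system): a model that imitates historical human decisions simply returns the most frequent past decision, with no access to whether any of those decisions were right.

```python
from collections import Counter

def learned_judgment(history, case):
    """Return the most frequent past decision for this case.

    'history' is a list of (case, decision) pairs made by humans.
    The model never sees whether any decision was correct or just;
    it only reflects the statistical majority of prior behavior."""
    votes = Counter(decision for c, decision in history if c == case)
    return votes.most_common(1)[0][0]

# Hypothetical record of past human decisions about the same case type.
history = [("loan", "approve")] * 7 + [("loan", "deny")] * 3

print(learned_judgment(history, "loan"))  # prints "approve"
```

If the historical decisions were biased or mistaken, the "judgment" produced here reproduces that bias exactly; nothing in the procedure evaluates correctness in a normative sense.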
Consider a dataset in which 90% of all inputs repeatedly assert that the sun revolves around the Earth. This dataset may include text, educational materials, historical records, or user-generated content that consistently encodes this belief.
An AI system trained on this dataset learns by identifying statistical regularities and maximizing predictive likelihood. Because the geocentric claim appears with overwhelming frequency, the system will infer that it is correct.
The AI has no independent mechanism to observe the solar system, run experiments, or question the consensus encoded in its training data.
The AI will reproduce the false belief with high confidence, not because it has been “fooled,” but because it faithfully reflects the dominant pattern in its training data. This demonstrates a fundamental limitation: AI does not distinguish truth from popularity; it distinguishes patterns from noise.
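The mechanism described above can be sketched in a few lines. This is a deliberately simplified model, not a real training pipeline: a maximum-likelihood estimate over a toy corpus is just a relative-frequency count, so the majority claim wins regardless of its truth.

```python
from collections import Counter

# Toy corpus: 90% of documents assert the geocentric claim.
corpus = (["the sun revolves around the earth"] * 90
          + ["the earth revolves around the sun"] * 10)

# Maximum-likelihood estimate of each statement's probability:
# its relative frequency in the training data, nothing more.
counts = Counter(corpus)
total = sum(counts.values())
probs = {statement: n / total for statement, n in counts.items()}

best = max(probs, key=probs.get)
print(best, probs[best])  # the majority (false) claim, with p = 0.9
```

No term in this estimate references reality; swapping the proportions of the two claims would flip the "confident" answer without any change to the procedure itself.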
Historically, humans corrected the geocentric model through observation, improved instrumentation, hypothesis testing, and a willingness to challenge prevailing consensus. AI systems cannot initiate such corrective processes independently. Without human intervention, false beliefs embedded in training data may persist without effective correction. This becomes especially evident in modern contexts, where AI can generate and disseminate content far more rapidly than humans can verify it. Such dynamics also enable malicious intent, whereby human actors—assisted by AI—can coordinate large-scale propaganda efforts and targeted misinformation campaigns.
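The self-reinforcing dynamic described above can also be simulated. The sketch below is a toy assumption, not a model of any real deployment: a system that generates new content by always emitting its most likely claim (greedy decoding), whose output is then folded back into the next round's training data, steadily increases the majority claim's share.

```python
# Toy feedback loop: generated content re-enters the training data.
false_docs, true_docs = 90, 10  # initial document counts (90% false)

shares = []
for _ in range(5):
    # Greedy generation: the model emits 50 new documents, all
    # repeating whichever claim is currently the majority.
    if false_docs >= true_docs:
        false_docs += 50
    else:
        true_docs += 50
    shares.append(false_docs / (false_docs + true_docs))

print([round(s, 3) for s in shares])  # monotonically rising false share
```

Each round entrenches the falsehood further: with no external check, the loop can only amplify whatever the data already says.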
AI systems lack epistemic grounding—the ability to tie beliefs to reality through experience and experimentation. They cannot validate claims beyond the symbolic data they consume.
As a result, AI is vulnerable to systematic misinformation, consensus-driven error, cultural and historical bias, and self-reinforcing falsehoods.
This report remains open to rebuttal and correction, given the continuing advancement of AI technologies and innovations; several critiques of its claims may yet arise.
These limitations imply that AI cannot function as an independent authority on truth. Human oversight is necessary for validation, and AI outputs must be contextualized and challenged. Claims of AI autonomy should be carefully qualified.
Artificial Intelligence systems are powerful instruments for processing information, but they remain fundamentally dependent on human input. Whether through manual data entry, web scraping, sensor-based inputs, or large-scale datasets, AI operates downstream of human observation and judgment.
The sun–Earth example illustrates a central limitation: AI reflects what humans collectively say, not what is empirically true. The risk is further intensified when distortions of truth are not accidental but intentional. When governmental or institutional actors possess both the authority to shape narratives and the technological capacity to deploy AI at scale, the potential for coordinated propaganda, targeted misinformation, and the systematic bending of empirical reality increases substantially. In such cases, AI does not merely reflect collective belief; it can become an accelerant for entrenched falsehoods.
AI detects patterns. Humans detect reality.
Written in 2025 by Seng Thao, System Janitor at www.thugon.com. All rights reserved. I documented this report for training purposes, mainly for IT management, business owners, and regulatory agencies. I have designed systems applications since the late 1990s and evaluated AI technologies since the early 2020s; my earlier work with language models in database and data-analysis contexts dates to the early 2000s.