urban-dandelion

Distilling the Knowledge in a Neural Network

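Uncategorized 2026. 5. 6. 21:08

type: paper
source: https://arxiv.org/abs/1503.02531
Distilling the Knowledge in a Neural Network
Authors: Geoffrey E. Hinton, O. Vinyals, J. Dean
Year: 2015
arXiv: 1503.02531
Field: Mathematics, Computer Science
Citations: 23861 (per Semantic Scholar, as of the time of writing)
1. Background and Problem Definition
Insects live as larvae in a form suited to absorbing nutrients, then pass through a pupal stage and metamorphose into adults optimized for flight and reproduction. This paper argues that the same analogy applies to large-scale machine learning models: the training stage and the deployment stage have fundamentally different requirements. Training stage: very..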

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Uncategorized 2026. 5. 6. 21:00

type: paper
source: https://arxiv.org/abs/2405.04434
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Authors: Zhihong Shao, Damai Dai, Daya Guo, Zihan Wang, Huajian Xin
Year: 2024
arXiv: 2405.04434
Field: Computer Science
Citations: 1127 (per Semantic Scholar, as of the time of writing)
1. Background and Problem Definition
Large language models (LLMs) tend to show stronger capabilities across a wide range of tasks as their parameter count grows. However, this scaling comes with two costs..

Tree of Thoughts: Deliberate Problem Solving with Large Language Models

Uncategorized 2026. 5. 4. 21:35

type: paper
source: https://arxiv.org/abs/2305.10601
Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Authors: Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, T. Griffiths, Yuan Cao, Karthik Narasimhan
Year: 2023
arXiv: 2305.10601
Field:
Citations: 3700 (per Semantic Scholar)
1. Background and Problem Definition
1.1 Structural Reasoning Limits of Large Language Models
Large language models (LMs) generate text through inherently token-level, left-to-right (autoregressive) decoding. This is Kahne..

Reflexion: language agents with verbal reinforcement learning

Uncategorized 2026. 5. 4. 21:35

type: paper
source: https://arxiv.org/abs/2303.11366
Reflexion: language agents with verbal reinforcement learning
Authors: Noah Shinn, Federico Cassano, Beck Labash, A. Gopinath, Karthik Narasimhan, Shunyu Yao
Year: 2023
arXiv: 2303.11366
Field:
Citations: 3033 (per Semantic Scholar)
1. Background and Problem Definition
For existing LLM agents to learn from failure, the model weights had to be updated (fine-tuning). This approach carries three fundamental constraints. First, it requires large training datasets, which makes it costly. Second, directly on the model weights ..

Chain-of-Verification Reduces Hallucination in Large Language Models

Uncategorized 2026. 5. 4. 21:33

type: paper
source: https://arxiv.org/abs/2309.11495
Chain-of-Verification Reduces Hallucination in Large Language Models
Authors: Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Asli Celikyilmaz, Jason Weston
Year: 2023
arXiv: 2309.11495
Field:
Citations: 0 (per Semantic Scholar)
1. Background and Problem Definition
Large language models (LLMs) suffer from hallucination: they confidently generate content that is not factual. Hallucination refers to information that is not grounded in the training data or is inconsistent with the facts, which the model ..

Self-Refine: Iterative Refinement with Self-Feedback

Uncategorized 2026. 5. 4. 21:32

type: paper
source: https://arxiv.org/abs/2303.17651
Self-Refine: Iterative Refinement with Self-Feedback
Authors: Aman Madaan, Niket Tandon, Prakhar Gupta, Skyler Hallinan, Luyu Gao, Sarah Wiegreffe, Uri Alon, Nouha Dziri, Shrimai Prabhumoye, Yiming Yang, Shashank Gupta, Bodhisattwa Prasad Majumder, Katherine Hermann, Sean Welleck, Amir Yazdanbakhsh, Peter Clark
Year: 2023
arXiv: 2303.17651
Field:
Citations: 0 (Semantic ..

Self-Consistency Improves Chain of Thought Reasoning in Language Models

Uncategorized 2026. 5. 4. 21:31

type: paper
source: https://arxiv.org/abs/2203.11171
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Authors: Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou
Year: 2022
arXiv: 2203.11171
Field:
Citations: 0 (per Semantic Scholar)
1. Background and Problem Definition
A complex reasoning problem has more than one path to the correct answer. When solving the problem "If you eat 2 of 3 apples, how many are left?", one can approach it by subtraction, or by counting what remains ..

Large Language Models are Zero-Shot Reasoners

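Uncategorized 2026. 5. 4. 21:30

type: paper
source: https://arxiv.org/abs/2205.11916
Large Language Models are Zero-Shot Reasoners
Authors: Takeshi Kojima, S. Gu, Machel Reid, Yutaka Matsuo, Yusuke Iwasawa
Year: 2022
arXiv: 2205.11916
Field: Computer Science
Citations: 6807 (per Semantic Scholar)
1. Background and Problem Definition
1.1 Scaling of Language Models and the Prompting Paradigm
The goal of a language model is to estimate a probability distribution over text. As the number of model parameters has scaled from millions (around 2016) to hundreds of millions (BERT level) to hundreds of billions (GPT-3 and others), in large models..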
