🦆 "Big data analytics with no complicated setup, now on a single laptop?" - What sets DuckDB (in-process DB) apart

DuckDB is an in-process analytical database that has recently drawn strong interest in the data analytics and engineering community. Where SQLite is specialized for transaction processing (OLTP), DuckDB is optimized for **data analytics (OLAP)**: an "SQLite for analytics."

1. In-process architecture

Most databases (PostgreSQL, MySQL, and so on) require a separate server process and communicate with it over the network, whereas DuckDB runs embedded inside the application as a library.

  • Difference: No server installation or complicated configuration is needed, there is almost no data-transfer overhead between the application and the database, and you can use it right away from Python or R just like calling any other library.

2. Column-oriented storage and vectorized execution

Typical operational databases read data row by row, whereas DuckDB reads it column by column.

  • Difference: Analytical queries that compute the sum or average of a particular column run considerably faster, because only the columns involved need to be read. DuckDB also uses a vectorized query execution engine, which processes values in batches and so makes good use of modern CPU features.

3. The "data glue" role

DuckDB is more than a place to store data; it is also very good at querying external data directly.

  • Direct file queries: CSV, Parquet, and JSON files can be queried directly with SQL, without first loading them into the database.
  • Interoperability: DuckDB exchanges data with Pandas, Polars, and Arrow data frames by sharing memory (zero-copy) where possible, which makes it a key connecting link in a data analysis pipeline.

| Category | SQLite | DuckDB | PostgreSQL |
| --- | --- | --- | --- |
| Primary use | Simple storage, transactions | Data analytics (OLAP) | General-purpose server database |
| Installation | Not required (file-based) | Not required (file-based) | Required (server-based) |
| Data layout | Row-based | Column-based | Row-based (default) |
| Performance | Strong at writes/updates | Optimized for large-scale analytics | Concurrent connections and scalability |

