finlab.data

Core data download module providing historical data for Taiwan and US stock markets.

Use Cases

Download historical data such as stock prices, financial statements, and institutional trading
Filter data by market or industry sector
Search for available data tables and fields
Configure data caching strategies
Limit data download range to save memory

Quick Examples

Basic Usage: Download Data

from finlab import data

# Download closing prices
close = data.get('price:收盤價')

# Download P/E ratio
pe_ratio = data.get('price_earning_ratio:本益比')

# Download monthly revenue
revenue = data.get('monthly_revenue:當月營收')

Search Available Fields

# Search for fields containing "收盤" (close)
data.search('收盤')
# Output: ['price:收盤價', 'etl:不含除權息收盤價', ...]

# Search US stock data
data.search('close', market='us')

Restrict Market Scope

# Only fetch listed (TSE) company data
with data.universe(market='TSE'):
    close = data.get('price:收盤價')

# Only fetch specific industry sectors
with data.universe(category=['水泥工業', '食品工業']):
    close = data.get('price:收盤價')

Detailed Guide

See Data Download Details for:
- Complete data download tutorial
- Data table structure explanation
- Advanced filtering techniques
- Error handling methods

Global Configuration

Force Cloud/Local Data

from finlab import data

# Force cloud download (re-download every time)
data.force_cloud_download = True

# Force local cache only (offline environment)
data.use_local_data_only = True

Limit Data Time Range

# Only download data from 2020-2023 (saves memory)
data.truncate_start = '2020-01-01'
data.truncate_end = '2023-12-31'

# All subsequent data.get() calls will use this range
close = data.get('price:收盤價')

Recommended Usage

Development phase: Use truncate_start to limit data range for faster testing
Production backtesting: Remove truncate limits, use full historical data
Low memory: Set truncate_start or use use_local_data_only
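
Switching between a truncated development range and the full history can be wrapped in a small helper. The sketch below is hypothetical and not part of finlab: `temporary_truncate` is an illustrative name, a `SimpleNamespace` stands in for `finlab.data` so the code runs without the library, and it assumes that setting the attributes back to their previous values (here `None`) removes the limit, which this page does not confirm.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def temporary_truncate(module, start=None, end=None):
    """Set truncate_start/truncate_end on `module`, then restore the old values."""
    old_start = getattr(module, "truncate_start", None)
    old_end = getattr(module, "truncate_end", None)
    module.truncate_start = start
    module.truncate_end = end
    try:
        yield module
    finally:
        # Restore the previous settings even if the block raised.
        module.truncate_start = old_start
        module.truncate_end = old_end


# Stand-in for `finlab.data` (assumption) so the sketch runs offline.
fake_data = SimpleNamespace(truncate_start=None, truncate_end=None)

with temporary_truncate(fake_data, start="2020-01-01", end="2023-12-31"):
    inside = (fake_data.truncate_start, fake_data.truncate_end)

after = (fake_data.truncate_start, fake_data.truncate_end)
```

With the real module, the same helper would be called as `temporary_truncate(data, start='2020-01-01')` around development-time `data.get()` calls.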

API Reference

data.get()

finlab.data.get

get(dataset, save_to_storage=True, force_download=False)

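Download historical data.

Look up dataset names in the historical data catalog, then pass a name to this function to retrieve that dataset. When save_to_storage is True, a copy is saved locally so that large datasets are not downloaded repeatedly.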

PARAMETER	DESCRIPTION
`dataset`	The name of dataset. TYPE: `str`
`save_to_storage`	Whether to save the dataset to storage for later use. Default is True. The argument will be removed in the future. Please use data.set_storage(FileStorage(use_cache=True)) instead. TYPE: `bool` DEFAULT: `True`
`force_download`	Whether to force download the dataset from cloud. Default is False. TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`DataFrame`	financial data

Examples:

To download the historical closing prices of all listed and OTC stocks, call this function:

from finlab import data
close = data.get('price:收盤價')
close

date	0015	0050	0051	0052	0053
2007-04-23	9.54	57.85	32.83	38.4	nan
2007-04-24	9.54	58.1	32.99	38.65	nan
2007-04-25	9.52	57.6	32.8	38.59	nan
2007-04-26	9.59	57.7	32.8	38.6	nan
2007-04-27	9.55	57.5	32.72	38.4	nan

Note

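By default, data.get downloads only the recent data and merges it with the local copy, which avoids re-downloading large datasets.

To force a full download, set the following before fetching data:

data.force_cloud_download = True

To use only local data with no additional download, set the following before fetching data:

data.use_local_data_only = True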
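
The merge of recent data with the local copy can be pictured with a toy example. This is only a sketch of the idea, not finlab's implementation: `merge_recent` is a hypothetical helper, and plain dictionaries keyed by date stand in for DataFrames. The values are the 0050 closing prices from the sample table above.

```python
def merge_recent(local_rows, recent_rows):
    """Combine cached rows with freshly downloaded rows, keyed by date string.

    Rows from `recent_rows` win when both sides contain the same date.
    """
    merged = dict(local_rows)
    merged.update(recent_rows)
    # ISO dates (YYYY-MM-DD) sort chronologically as plain strings.
    return dict(sorted(merged.items()))


local = {"2007-04-23": 57.85, "2007-04-24": 58.10}
recent = {"2007-04-24": 58.10, "2007-04-25": 57.60}
combined = merge_recent(local, recent)
```

Only the one missing date has to come from the cloud; the overlapping date is simply overwritten.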

Common Data Tables

Price Data:
- price:收盤價 - Daily closing price
- price:開盤價 - Daily opening price
- price:最高價 / price:最低價 - Daily high/low
- price:成交股數 - Trading volume

Fundamental Data:
- price_earning_ratio:本益比 - P/E ratio
- price_earning_ratio:股價淨值比 - P/B ratio
- fundamental_features:股東權益報酬率 - ROE
- financial_statement:每股盈餘 - EPS

Institutional Trading Data:
- institutional_investors_trading_summary:投信買賣超股數
- margin_transactions:融資使用率
- etl:外資持股比例

Monthly Revenue:
- monthly_revenue:當月營收
- monthly_revenue:去年同月增減(%)

See the full list at the Database Catalog.

Common Errors

KeyError: Data table name is wrong or API token is not set
Empty DataFrame: Query conditions too strict or data does not exist
Out of memory: Too much data downloaded, use truncate_start to limit range
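
A thin wrapper can turn the KeyError case into a readable message. The sketch below is an assumption-laden illustration: `safe_get` is a hypothetical helper, and a dictionary lookup stands in for `data.get` so the code runs without finlab or a token.

```python
def safe_get(fetch, name):
    """Call fetch(name); return None instead of raising when the lookup fails."""
    try:
        return fetch(name)
    except KeyError:
        print(f"Could not load {name!r}: check the table name and API token.")
        return None


# Stand-in for data.get (assumption), backed by a plain dict.
fake_tables = {"price:收盤價": [57.85, 58.1, 57.6]}
found = safe_get(fake_tables.__getitem__, "price:收盤價")
missing = safe_get(fake_tables.__getitem__, "price:收盤")
```

With the library installed, the call would look like `safe_get(data.get, 'price:收盤價')`.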

data.search()

finlab.data.search

search(keyword=None, market=None)

Search the data fields available in the FinLab database.

PARAMETER	DESCRIPTION
`keyword`	Search keyword. If None, all fields are listed. TYPE: `str` DEFAULT: `None`
`market`	Market to search ('tw', 'us', 'hk', 'jp', 'kr', 'uk', 'all'). Defaults to the market set by data.set_market(), or 'tw' if none has been set. TYPE: `str` DEFAULT: `None`

RETURNS	DESCRIPTION
`list`	Dataset names usable with data.get(), in "table:column" format. TYPE: `list`

Examples:

# List all Taiwan stock data
tw_data = data.search()

# Search Taiwan stock fields containing '收盤' (close)
close_data = data.search('收盤', market='tw')
# ['price:收盤價']

# Search US stock fields containing 'close'
us_close = data.search('close', market='us')
# ['us_price:close', 'us_div_adj_price:adj_close', ...]

# Search Japan stock fields containing 'price'
jp_price = data.search('price', market='jp')

# Search fields containing 'price' across all markets
all_price = data.search('price', market='all')

Examples:

# Search for fields containing "股東" (shareholder)
data.search('股東')
# ['fundamental_features:股東權益報酬率', 'internal_equity_pledge:百分之十以上大股東持有股數', ...]

# Search for US stock PE ratio
data.search('pe', market='us')

# List all Taiwan stock fields
all_fields = data.search()
print(f"Total {len(all_fields)} fields")
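
Because every result uses the "table:column" format, a long result list is easier to read once grouped by table. The following is a minimal sketch with a hypothetical helper, `group_by_table`; the sample names are copied from the search outputs shown on this page rather than fetched from the service.

```python
from collections import defaultdict


def group_by_table(names):
    """Group "table:column" strings into {table: [column, ...]}."""
    grouped = defaultdict(list)
    for name in names:
        # partition splits on the first ':' only, so columns may contain ':'.
        table, _, column = name.partition(":")
        grouped[table].append(column)
    return dict(grouped)


sample = [
    "price:收盤價",
    "etl:不含除權息收盤價",
    "us_price:close",
    "us_div_adj_price:adj_close",
]
tables = group_by_table(sample)
```

In practice the input would be the list returned by `data.search()`.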

data.universe()

finlab.data.universe

universe

universe(exchange='ALL', sector='ALL', exclude_sector=None, industry='ALL', asset_type=None, *, market=None, category=None, exclude_category=None)

Context manager to set a global stock universe filter for data retrieval.

Auto-dispatches TW vs international logic based on data._current_market.

Parameters

exchange : str | list[str], default 'ALL'
    TW: 'TWSE'/'TPEx' (or legacy 'TSE'/'OTC'). International: 'NASDAQ'/'NYSE'/'AMEX'/'HKEX'/'TSE'/etc.
sector : str | list[str], default 'ALL'
    Sector name(s) with regex matching.
exclude_sector : str | list[str] | None, default None
    Sector(s) to exclude (TW only).
industry : str | list[str], default 'ALL'
    Industry filter (international markets).
asset_type : str | None, default None
    TW only: 'ETF' or 'STOCK_FUTURE'.
market : str | None
    Legacy alias for TW exchange+asset_type.
category : str | list[str] | None
    Legacy alias for sector.
exclude_category : str | list[str] | None
    Legacy alias for exclude_sector.

Examples

TW market (default):

from finlab import data
with data.universe(exchange=['TWSE', 'TPEx'], sector=['鋼鐵工業', '航運業']):
    close = data.get('price:收盤價')

Legacy TW usage (still works):

with data.universe(market='TSE_OTC', sector='水泥', exclude_sector='ETF'):
    close = data.get('price:收盤價')

US market:

data.set_market('us')
with data.universe(exchange='NASDAQ', sector='Technology'):
    close = data.get('price:close')

JP market:

data.set_market('jp')
with data.universe(sector='Technology'):
    close = data.get('price:close')

set_universe

set_universe(exchange='ALL', sector='ALL', exclude_sector=None, industry='ALL', asset_type=None, *, market=None, category=None, exclude_category=None)

Set global stock universe filter. Auto-dispatches based on current market.

When data.set_market('tw') (or no market set), uses TW logic (security_categories). For any other market (us, hk, jp, kr, uk), loads {market}_company_profile and filters by available columns.

Parameters

exchange : str | list[str], default 'ALL'
    TW: 'TWSE', 'TPEx' (or legacy 'TSE', 'OTC'). International: 'NASDAQ', 'NYSE', 'AMEX', 'HKEX', 'TSE', etc.
sector : str | list[str], default 'ALL'
    Sector name(s) with regex matching.
exclude_sector : str | list[str] | None, default None
    Sector(s) to exclude (TW only).
industry : str | list[str], default 'ALL'
    Industry filter (international markets).
asset_type : str | None, default None
    TW only: 'ETF' or 'STOCK_FUTURE'.
market : str | None
    Legacy alias for TW exchange+asset_type (e.g. 'TSE_OTC', 'ETF').
category : str | list[str] | None
    Legacy alias for sector.
exclude_category : str | list[str] | None
    Legacy alias for exclude_sector.

set_us_universe

set_us_universe(sector='ALL', industry='ALL', exchange='ALL')

Set global US stock universe filter.

Thin wrapper around _set_intl_universe_impl for backward compatibility.

Parameters

sector : str | list[str], default 'ALL'
    Sector filter with regex-like substring matching.
industry : str | list[str], default 'ALL'
    Industry filter with regex-like substring matching.
exchange : str | list[str], default 'ALL'
    Exchange filter (e.g., 'NASDAQ', 'NYSE', 'AMEX').

Examples:

# Example 1: Only listed companies
with data.universe(market='TSE'):
    close = data.get('price:收盤價')
    print(f"Number of listed companies: {len(close.columns)}")

# Example 2: Specific industry sectors
with data.universe(category=['半導體業']):
    close = data.get('price:收盤價')

# Example 3: Combined conditions
# (universe() has no `size` parameter in the signature documented above)
with data.universe(market='TSE_OTC', category=['電子工業']):
    close = data.get('price:收盤價')

Available market parameter values:
- 'TSE' - Listed (Taiwan Stock Exchange)
- 'OTC' - OTC (Taipei Exchange)
- 'TSE_OTC' - Listed + OTC
- 'ALL' - All (including Emerging Stock Board)
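
The parameter descriptions say that sector filters use regex matching. How such matching might behave can be sketched in plain Python. This is an illustration under assumptions, not the library's code: `match_sectors` is a hypothetical helper, the sector names are invented, and it assumes a name is kept when any pattern matches anywhere in it (`re.search` semantics).

```python
import re


def match_sectors(available, patterns):
    """Return sectors matching any pattern; 'ALL' keeps everything."""
    if patterns == "ALL":
        return list(available)
    if isinstance(patterns, str):
        patterns = [patterns]
    return [s for s in available if any(re.search(p, s) for p in patterns)]


# Hypothetical sector names, for illustration only.
sectors = ["Technology", "Financial Services", "Healthcare", "Communication Services"]
tech = match_sectors(sectors, "Tech")
services = match_sectors(sectors, ["Services$"])
everything = match_sectors(sectors, "ALL")
```

Under these semantics a short pattern such as 'Tech' selects 'Technology', which is consistent with the '水泥' pattern used for the sector filter in the legacy example.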

data.us_universe()

finlab.data.us_universe

us_universe(sector='ALL', industry='ALL', exchange='ALL')

Context manager to set a global stock universe filter for US market data retrieval.

This context manager limits the set of US stocks returned by data functions to a specific sector, industry, and exchange selection. The filter is applied globally within the context and is restored after the context exits.

Parameters

sector : str | list[str], default 'ALL'
    Sector name(s) to include. Supports regex-like substring matching.
industry : str | list[str], default 'ALL'
    Industry name(s) to include. Supports regex-like substring matching.
exchange : str | list[str], default 'ALL'
    Exchange name(s) to include. Common values: 'NASDAQ', 'NYSE', 'AMEX'.

Examples

from finlab import data
with data.us_universe(sector='Technology', exchange='NASDAQ'):
    close = data.get('us_price:close')

US Market Filtering:

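# us_universe() accepts only sector, industry, and exchange (see the signature above)

# Only NYSE-listed stocks
with data.us_universe(exchange='NYSE'):
    close = data.get('us_price:close')

# NASDAQ and AMEX stocks together
with data.us_universe(exchange=['NASDAQ', 'AMEX']):
    close = data.get('us_price:close')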
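
The description above says the filter is applied globally inside the context and restored when the context exits. That set-and-restore pattern can be sketched with contextlib. Everything here is a toy: `toy_universe` and `_state` are hypothetical names, and whether nested finlab filters combine or replace each other is not stated on this page (this sketch simply replaces).

```python
from contextlib import contextmanager

# Hypothetical module-level state standing in for the library's global filter.
_state = {"filter": None}


@contextmanager
def toy_universe(**conditions):
    """Install a global filter for the duration of the with-block."""
    previous = _state["filter"]
    _state["filter"] = conditions
    try:
        yield
    finally:
        # Restore whatever was active before, even if the block raised.
        _state["filter"] = previous


with toy_universe(exchange="NASDAQ"):
    outer = _state["filter"]
    with toy_universe(sector="Technology"):
        inner = _state["filter"]
    restored_outer = _state["filter"]
final = _state["filter"]
```

The try/finally is the important part: an exception inside the block must not leave a stale filter behind for later calls.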

data.indicator()

finlab.data.indicator

indicator(indname, adjust_price=False, resample='D', **kwargs)

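Supports hundreds of technical indicators from TA-Lib and pandas_ta, and computes them for 2,000 stocks over 10 years of data.

Before using this function, install the packages that compute the technical indicators.

PARAMETER	DESCRIPTION
`indname`	Indicator name. TA-Lib examples: SMA, STOCH, RSI (see the TA-Lib documentation). pandas_ta examples: supertrend, ssf (see the pandas_ta documentation). TYPE: `str`
`adjust_price`	Whether to compute with adjusted prices. TYPE: `bool` DEFAULT: `False`
`resample`	Price period for the indicator, e.g. `D` for daily, `W` for weekly, `M` for monthly. TYPE: `str` DEFAULT: `'D'`
`market`	Market selection, e.g. `TW_STOCK` for Taiwan stocks, `US_STOCK` for US stocks. TYPE: `str`
`**kwargs`	Indicator parameters. For TA-Lib's RSI, for example, the adjustable parameter is the calculation period, `timeperiod=14`. TYPE: `dict` DEFAULT: `{}`

The examples below, together with the official TA-Lib documentation, should be enough to get started with building technical indicators.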

Technical Indicator Examples:

from finlab import data

# Get MACD indicator (TA-Lib's MACD produces several output series)
macd = data.indicator('MACD')

# Get RSI indicator with a 14-period window
rsi = data.indicator('RSI', timeperiod=14)
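
To make the role of a period argument such as `timeperiod` concrete, here is a simple moving average written in plain Python. It is a sketch for intuition only: it does not call TA-Lib or finlab, `simple_moving_average` is a hypothetical helper, and the prices are made up so the averages can be checked by hand.

```python
def simple_moving_average(values, timeperiod):
    """Return the SMA series; positions without a full window are None."""
    result = []
    for i in range(len(values)):
        if i + 1 < timeperiod:
            # Not enough history yet to fill one window.
            result.append(None)
        else:
            window = values[i + 1 - timeperiod : i + 1]
            result.append(sum(window) / timeperiod)
    return result


prices = [10.0, 11.0, 12.0, 13.0, 14.0]
sma3 = simple_moving_average(prices, timeperiod=3)
# sma3 == [None, None, 11.0, 12.0, 13.0]
```

A longer period gives a smoother series but leaves more leading positions without a value.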

Cache Management

finlab.data.set_storage

set_storage(storage)

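Set how historical data is stored locally. When data.get retrieves historical data, a copy is saved locally by default to avoid re-downloading large datasets. The storage object is the interface used to save that data. Two storage interfaces are provided: finlab.data.CacheStorage, which keeps data in memory, and finlab.data.FileStorage, which keeps data in files. See CacheStorage and FileStorage for details. By default, finlab.data.FileStorage is used, and repeatedly requested historical data is stored in the operating system's default temporary folder.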

PARAMETER	DESCRIPTION
`storage`	The interface of storage TYPE: `Storage`

Examples:

To switch to file-based storage:

from finlab import data
data.set_storage(data.FileStorage())
close = data.get('price:收盤價')

The downloaded data is saved locally at ./finlab_db/price#收盤價.pickle and can be loaded with pickle:

import pickle
with open('finlab_db/price#收盤價.pickle', 'rb') as f:
    close = pickle.load(f)

finlab.data.CacheStorage

CacheStorage()

Store historical data in an in-memory cache.

Examples:

To switch to in-memory cache storage:

from finlab import data
data.set_storage(data.CacheStorage())
close = data.get('price:收盤價')

The cached data can be accessed directly:

close = data._storage._cache['price:收盤價']

finlab.data.FileStorage

FileStorage(path=None, use_cache=True)

Store historical data in files.

PARAMETER	DESCRIPTION
`path`	Path where the data is stored. TYPE: `str` DEFAULT: `None`
`use_cache`	Whether to also keep a copy of the data in an in-memory cache. TYPE: `bool` DEFAULT: `True`

Examples:

To switch to file-based storage:

from finlab import data
data.set_storage(data.FileStorage())
close = data.get('price:收盤價')

The downloaded data is saved locally at ./finlab_db/price#收盤價.pickle and can be loaded with pickle:

import pickle
with open('finlab_db/price#收盤價.pickle', 'rb') as f:
    close = pickle.load(f)
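
The example path suggests that the ':' in a dataset name becomes '#' in the file name. Assuming that convention holds for other datasets (it is inferred from this single example, not from the library's source), a hypothetical helper can predict where a dataset would be stored:

```python
from pathlib import Path


def cache_path(dataset, root="finlab_db"):
    """Map 'table:column' to '<root>/table#column.pickle' (assumed convention)."""
    return Path(root) / (dataset.replace(":", "#") + ".pickle")


path = cache_path("price:收盤價")
print(path.as_posix())
```

Checking `path.exists()` before unpickling is a cheap way to avoid a FileNotFoundError when the dataset has not been downloaded yet.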

diagnose

diagnose(dataset=None)

Diagnose the state of local storage.

PARAMETER	DESCRIPTION
`dataset`	Name of the dataset to check, e.g. 'price:收盤價'. If omitted, all local datasets are listed. TYPE: `str` DEFAULT: `None`

Examples:

from finlab import data
data._storage.diagnose()  # List all local datasets
data._storage.diagnose('price:收盤價')  # Check a specific dataset

Custom Cache Strategy:

from finlab import data
from finlab.data import set_storage, FileStorage

# Use a custom directory
storage = FileStorage('/path/to/custom/cache')
set_storage(storage)

# All subsequent data will be cached to the specified location
close = data.get('price:收盤價')

Other Utilities

finlab.data.get_strategies

get_strategies(api_token=None)

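Retrieve data for strategies that have been uploaded to the quant platform.

This returns the data shown on your strategy dashboard, such as each strategy's return curve, return statistics, Sharpe ratio, recent positions, and recent rebalancing dates. The data can be used to aggregate multiple strategies.

PARAMETER	DESCRIPTION
`api_token`	If the finlab module has no api_token set, a GUI page opens automatically; copy the api_token from that web page and paste it into the input field. TYPE: `str` DEFAULT: `None`

Returns: (dict): strategies data

Response detail: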

``` py
{
  strategy1:{
    'asset_type': '',
    'drawdown_details': {
       '2015-06-04': {
         'End': '2015-11-03',
         'Length': 152,
         'drawdown': -0.19879090089478024
         },
         ...
      },
    'fee_ratio': 0.000475,
    'last_trading_date': '2022-06-10',
    'last_updated': 'Sun, 03 Jul 2022 12:02:27 GMT',
    'ndays_return': {
      '1': -0.01132480035770611,
      '10': -0.0014737286933147464,
      '20': -0.06658015749110646,
      '5': -0.002292995729485159,
      '60': -0.010108700314771735
      },
    'next_trading_date': '2022-06-10',
    'positions': {
      '1413 宏洲': {
        'entry_date': '2022-05-10',
        'entry_price': 10.05,
        'exit_date': '',
        'next_weight': 0.1,
        'return': -0.010945273631840613,
        'status': '買進',
        'weight': 0.1479332345384493
        },
      'last_updated': 'Sun, 03 Jul 2022 12:02:27 GMT',
      'next_trading_date': '2022-06-10',
      'trade_at': 'open',
      'update_date': '2022-06-10'
      },
    'return_table': {
      '2014': {
        'Apr': 0.0,
        'Aug': 0.06315180932606546,
        'Dec': 0.0537589857541485,
        'Feb': 0.0,
        'Jan': 0.0,
        'Jul': 0.02937490104459939,
        'Jun': 0.01367930162104769,
        'Mar': 0.0,
        'May': 0.0,
        'Nov': -0.0014734320286596825,
        'Oct': -0.045082529665408266,
        'Sep': 0.04630906972509852,
        'YTD': 0.16626214846456966
        },
        ...
      },
    'returns': {
      'time': [
        '2014-06-10',
        '2014-06-11',
        '2014-06-12',
        ...
        ],
      'value': [
        100,
        99.9,
        100.2,
        ...
        ]
      },
    'stats': {
      'avg_down_month': -0.03304015302646822,
      'avg_drawdown': -0.0238021414698247,
      'avg_drawdown_days': 19.77952755905512,
      'avg_up_month': 0.05293384465715908,
      'cagr': 0.33236021285588846,
      'calmar': 1.65261094975066,
      'daily_kurt': 4.008888367138843,
      'daily_mean': 0.3090784769257415,
      'daily_sharpe': 1.747909002374217,
      'daily_skew': -0.6966018726321078,
      'daily_sortino': 2.8300677082214034,
      ...
      },
    'tax_ratio': 0.003,
    'trade_at': 'open',
    'update_date': '2022-06-10'
    },
  strategy2:{...},
  ...}
```
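
As one example of aggregating several strategies, the response can be ranked by the `daily_sharpe` value under each strategy's `stats`. The sketch below assumes the response shape shown above; `rank_by_sharpe` is a hypothetical helper, and the payload is a trimmed, made-up stand-in rather than real output.

```python
def rank_by_sharpe(strategies):
    """Return (name, daily_sharpe) pairs sorted from highest to lowest."""
    pairs = [
        (name, info["stats"]["daily_sharpe"])
        for name, info in strategies.items()
        if "daily_sharpe" in info.get("stats", {})
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Made-up payload that mirrors only the keys used here.
sample = {
    "strategy1": {"stats": {"daily_sharpe": 1.75, "cagr": 0.33}},
    "strategy2": {"stats": {"daily_sharpe": 0.90, "cagr": 0.12}},
    "strategy3": {"stats": {}},  # skipped: no Sharpe value available
}
ranking = rank_by_sharpe(sample)
```

With real data, `sample` would be replaced by the dictionary returned from `data.get_strategies()`.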

FAQ

Q: How do I find out what data is available for download?

Method 1: Use search()

# List all fields
all_fields = data.search()
for field in all_fields[:10]:
    print(field)

Method 2: Check the online database

Visit the FinLab Database Catalog to browse the full list.

Q: Data download is slow, what can I do?

# Method 1: Limit time range
data.truncate_start = '2020-01-01'

# Method 2: Use cache (second call will be fast)
close = data.get('price:收盤價')  # First call is slow
close = data.get('price:收盤價')  # Second call is fast (uses cache)

# Method 3: Use universe to limit the set of stocks
# (universe() has no `size` parameter; filter by market or sector instead)
with data.universe(market='TSE'):
    close = data.get('price:收盤價')  # Only listed (TSE) stocks

Q: KeyError: 'price:收盤價' - what should I do?

Possible causes:
1. Not logged in - Run finlab.login() or finlab.login('YOUR_TOKEN')
2. Incorrect field name - Use data.search('收盤') to verify the correct name
3. Invalid API token - Re-obtain the token

import finlab

# Check if logged in
try:
    token, token_type = finlab.get_token()
    print(f"Logged in ({token_type})")
except Exception:
    print("Not logged in, please run finlab.login()")

Q: How do I download US stock data?

from finlab import data

# US stock closing prices (data.get() has no market argument; use the US table name)
us_close = data.get('us_price:close')

# Search US stock fields
us_fields = data.search(market='us')

Q: What about missing values in the data?

close = data.get('price:收盤價')

# Check missing values
print(f"Missing value ratio: {close.isna().sum().sum() / close.size:.2%}")

# Fill missing values
close_filled = close.ffill()  # Forward fill

# Or drop stocks with too many missing values
close_clean = close.dropna(axis=1, thresh=int(len(close) * 0.8))  # Keep stocks with 80%+ data

Q: How can I save memory?

# Method 1: Limit time range
data.truncate_start = '2020-01-01'

# Method 2: Process in batches
close = data.get('price:收盤價')
all_stocks = close.columns
for batch in [all_stocks[i:i+100] for i in range(0, len(all_stocks), 100)]:
    batch_close = close[batch]
    # Process 100 stocks at a time...

# Method 3: Only download the fields you need
# Avoid calling data.get() for too many tables at once
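
The batching in Method 2 can be factored into a reusable generator. This is a minimal sketch with a hypothetical helper, `chunked`; the stock IDs are borrowed from the sample table earlier on this page, and a batch size of 2 is used only to keep the output short.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start : start + size]


stock_ids = ["0015", "0050", "0051", "0052", "0053"]
batches = list(chunked(stock_ids, 2))
# batches == [['0015', '0050'], ['0051', '0052'], ['0053']]
```

Applied to a DataFrame, each batch of column labels would be used as `close[batch]`, so only one slice of columns is processed at a time.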

Resources

Data Download Detailed Tutorial - Complete usage guide
FinLab Database Catalog - All available data tables
Quick Start Guide - Getting started
FAQ - More troubleshooting