finlab.data
Core data download module providing historical data for Taiwan and US stock markets.
Use Cases
- Download historical data such as stock prices, financial statements, and institutional trading
- Filter data by market or industry sector
- Search for available data tables and fields
- Configure data caching strategies
- Limit data download range to save memory
Quick Examples
Basic Usage: Download Data
from finlab import data
# Download closing prices
close = data.get('price:收盤價')
# Download P/E ratio
pe_ratio = data.get('price_earning_ratio:本益比')
# Download monthly revenue
revenue = data.get('monthly_revenue:當月營收')
Search Available Fields
# Search for fields containing "收盤" (close)
data.search('收盤')
# Output: ['price:收盤價', 'etl:不含除權息收盤價', ...]
# Search US stock data
data.search('close', market='us')
Restrict Market Scope
# Only fetch listed (TSE) company data
with data.universe(market='TSE'):
    close = data.get('price:收盤價')
# Only fetch specific industry sectors
with data.universe(category=['水泥工業', '食品工業']):
    close = data.get('price:收盤價')
Detailed Guide
See Data Download Details for:
- Complete data download tutorial
- Data table structure explanation
- Advanced filtering techniques
- Error handling methods
Global Configuration
Force Cloud/Local Data
from finlab import data
# Force cloud download (re-download every time)
data.force_cloud_download = True
# Force local cache only (offline environment)
data.use_local_data_only = True
Limit Data Time Range
# Only download data from 2020-2023 (saves memory)
data.truncate_start = '2020-01-01'
data.truncate_end = '2023-12-31'
# All subsequent data.get() calls will use this range
close = data.get('price:收盤價')
Recommended Usage
- Development phase: Use truncate_start to limit the data range for faster testing
- Production backtesting: Remove truncate limits and use full historical data
- Low memory: Set truncate_start or use use_local_data_only
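The recommendations above can be collected into a small helper that switches settings per environment. This is only a sketch: `apply_profile` and the profile names are hypothetical, a `SimpleNamespace` stands in for the `finlab.data` module so the snippet runs without a login, and it assumes that setting `truncate_start` or `truncate_end` to `None` removes the limit, which should be checked against your FinLab version.

```python
from types import SimpleNamespace

# Hypothetical profiles mirroring the recommendations above.
PROFILES = {
    "dev": {"truncate_start": "2020-01-01", "truncate_end": None},
    "prod": {"truncate_start": None, "truncate_end": None},
}

def apply_profile(cfg, profile):
    """Copy the settings of the named profile onto cfg and return cfg."""
    if profile not in PROFILES:
        raise ValueError(f"unknown profile: {profile!r}")
    for name, value in PROFILES[profile].items():
        setattr(cfg, name, value)
    return cfg

# Stand-in for the finlab.data module (assumption: same attribute names).
cfg = SimpleNamespace(truncate_start=None, truncate_end=None)
apply_profile(cfg, "dev")
print(cfg.truncate_start)
```

With the real library, `cfg` would presumably be the `data` module itself, for example `apply_profile(data, "dev")` before any `data.get()` call.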
API Reference
data.get()
finlab.data.get
Download historical data.
Look up the dataset name in the historical data catalog, then pass it to this function to retrieve the data.
If save_to_storage is True, a local copy is saved automatically so that large datasets are not downloaded repeatedly.
| PARAMETER | DESCRIPTION |
|---|---|
| dataset | The name of the dataset. |
| save_to_storage | Whether to save the dataset to storage for later use. Default is True. This argument will be removed in the future; use data.set_storage(FileStorage(use_cache=True)) instead. |
| force_download | Whether to force a download of the dataset from the cloud. Default is False. |

| RETURNS | DESCRIPTION |
|---|---|
| DataFrame | Financial data |
Examples:
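To download the closing-price history of all listed and OTC stocks, a single call such as data.get('price:收盤價') is enough. The first rows of the result look like this: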
| date | 0015 | 0050 | 0051 | 0052 | 0053 |
|---|---|---|---|---|---|
| 2007-04-23 | 9.54 | 57.85 | 32.83 | 38.4 | nan |
| 2007-04-24 | 9.54 | 58.1 | 32.99 | 38.65 | nan |
| 2007-04-25 | 9.52 | 57.6 | 32.8 | 38.59 | nan |
| 2007-04-26 | 9.59 | 57.7 | 32.8 | 38.6 | nan |
| 2007-04-27 | 9.55 | 57.5 | 32.72 | 38.4 | nan |
Common Data Tables
Price Data:
- price:收盤價 - Daily closing price
- price:開盤價 - Daily opening price
- price:最高價 / price:最低價 - Daily high/low
- price:成交股數 - Trading volume
Fundamental Data:
- price_earning_ratio:本益比 - P/E ratio
- price_earning_ratio:股價淨值比 - P/B ratio
- fundamental_features:股東權益報酬率 - ROE
- financial_statement:每股盈餘 - EPS
Institutional Trading Data:
- institutional_investors_trading_summary:投信買賣超股數
- margin_transactions:融資使用率
- etl:外資持股比例
Monthly Revenue:
- monthly_revenue:當月營收
- monthly_revenue:去年同月增減(%)
See the full list at the Database Catalog.
Common Errors
- KeyError: Data table name is wrong or API token is not set
- Empty DataFrame: Query conditions too strict or data does not exist
- Out of memory: Too much data downloaded; use truncate_start to limit the range
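One way to make the KeyError case easier to diagnose is to re-raise it with candidate names attached. The sketch below is an illustration, not part of FinLab: `get_with_hint` is a hypothetical helper, and the two `fake_*` functions stand in for `data.get` and `data.search` so the snippet runs offline.

```python
def get_with_hint(fetch, search, name):
    """Call fetch(name); on KeyError, re-raise with candidate names from search()."""
    try:
        return fetch(name)
    except KeyError as exc:
        # Use the column part of "table:column" as the search keyword.
        keyword = name.split(":", 1)[-1]
        candidates = search(keyword)
        raise KeyError(
            f"{name!r} not found; similar names: {candidates}"
        ) from exc

# Stand-ins so the sketch runs offline; with FinLab you would pass
# data.get and data.search instead (assumption: they behave as documented here).
_tables = {"price:收盤價": [1, 2, 3]}

def fake_fetch(name):
    return _tables[name]

def fake_search(keyword):
    return [n for n in _tables if keyword in n]

print(get_with_hint(fake_fetch, fake_search, "price:收盤價"))
```

Passing the fetch and search functions as arguments keeps the helper testable without network access.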
data.search()
finlab.data.search
Search the data fields available in the FinLab database.
| PARAMETER | DESCRIPTION |
|---|---|
| keyword | Search keyword. If None, all fields are listed. |
| market | Market to search ('tw', 'us', 'hk', 'jp', 'kr', 'uk', 'all'). Defaults to the market set by data.set_market(), or 'tw' if none is set. |

| RETURNS | DESCRIPTION |
|---|---|
| list | Dataset names usable with data.get(), in "table:column" format |
Examples:
# List all Taiwan stock data
tw_data = data.search()
# Search Taiwan stock fields containing '收盤' (close)
close_data = data.search('收盤', market='tw')
# ['price:收盤價']
# Search US stock fields containing 'close'
us_close = data.search('close', market='us')
# ['us_price:close', 'us_div_adj_price:adj_close', ...]
# Search Japanese stock fields containing 'price'
jp_price = data.search('price', market='jp')
# Search fields containing 'price' across all markets
all_price = data.search('price', market='all')
Examples:
# Search for fields containing "股東" (shareholder)
data.search('股東')
# ['fundamental_features:股東權益報酬率', 'internal_equity_pledge:百分之十以上大股東持有股數', ...]
# Search for US stock PE ratio
data.search('pe', market='us')
# List all Taiwan stock fields
all_fields = data.search()
print(f"Total {len(all_fields)} fields")
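Because search() returns names in "table:column" format, the result can be regrouped by table to get an overview of what each table offers. The following is a minimal sketch; `group_by_table` is a hypothetical helper, and the sample list reuses names that appear in this document rather than a live search result.

```python
from collections import defaultdict

def group_by_table(names):
    """Group "table:column" names into {table: [column, ...]}."""
    grouped = defaultdict(list)
    for name in names:
        table, _, column = name.partition(":")
        grouped[table].append(column)
    return dict(grouped)

# Sample names that appear elsewhere in this document.
sample = ["price:收盤價", "etl:不含除權息收盤價", "price:開盤價"]
print(group_by_table(sample))
```

With the real library, the input would be the list returned by `data.search(...)`.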
data.universe()
finlab.data.universe
universe
universe(exchange='ALL', sector='ALL', exclude_sector=None, industry='ALL', asset_type=None, *, market=None, category=None, exclude_category=None)
Context manager to set a global stock universe filter for data retrieval.
Auto-dispatches TW vs international logic based on data._current_market.
Parameters
exchange : str | list[str], default 'ALL'
TW: 'TWSE'/'TPEx' (or legacy 'TSE'/'OTC').
International: 'NASDAQ'/'NYSE'/'AMEX'/'HKEX'/'TSE'/etc.
sector : str | list[str], default 'ALL'
Sector name(s) with regex matching.
exclude_sector : str | list[str] | None, default None
Sector(s) to exclude (TW only).
industry : str | list[str], default 'ALL'
Industry filter (international markets).
asset_type : str | None, default None
TW only: 'ETF' or 'STOCK_FUTURE'.
market : str | None
Legacy alias for TW exchange+asset_type.
category : str | list[str] | None
Legacy alias for sector.
exclude_category : str | list[str] | None
Legacy alias for exclude_sector.
Examples
TW market (default):
from finlab import data
with data.universe(exchange=['TWSE', 'TPEx'], sector=['鋼鐵工業', '航運業']):
    close = data.get('price:收盤價')
Legacy TW usage (still works):
with data.universe(market='TSE_OTC', sector='水泥', exclude_sector='ETF'):
    close = data.get('price:收盤價')
US market:
data.set_market('us')
with data.universe(exchange='NASDAQ', sector='Technology'):
    close = data.get('price:close')
JP market:
data.set_market('jp')
with data.universe(sector='Technology'):
    close = data.get('price:close')
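The sector and exclude_sector parameters are described as regex-matched. As a rough illustration of what that kind of matching means (this is not FinLab's implementation), the sketch below filters a hypothetical stock-to-sector mapping with `re.search`; `filter_by_sector` and the mapping are made up for the example.

```python
import re

def filter_by_sector(sectors, include="ALL", exclude=None):
    """Return stock ids whose sector matches include and does not match exclude.

    sectors: {stock_id: sector_name}; include/exclude: a regex string or a list of them.
    """
    def as_list(value):
        return [value] if isinstance(value, str) else list(value)

    def matches(patterns, text):
        return any(re.search(p, text) for p in patterns)

    selected = []
    for stock_id, sector in sectors.items():
        if include != "ALL" and not matches(as_list(include), sector):
            continue
        if exclude is not None and matches(as_list(exclude), sector):
            continue
        selected.append(stock_id)
    return selected

# Hypothetical mapping, for illustration only.
sectors = {"1101": "水泥工業", "2002": "鋼鐵工業", "2603": "航運業", "0050": "ETF"}
print(filter_by_sector(sectors, include=["鋼鐵", "航運"]))
print(filter_by_sector(sectors, exclude="ETF"))
```

Because `re.search` matches anywhere in the string, a partial name such as '水泥' would select '水泥工業', which is consistent with the legacy example that passes sector='水泥'.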
us_universe
Context manager to set a global stock universe filter for US market data retrieval.
This context manager limits the set of US stocks returned by data functions to a specific sector, industry, and exchange selection. The filter is applied globally within the context and is restored after the context exits.
Parameters
sector : str | list[str], default 'ALL'
Sector name(s) to include. Supports regex-like substring matching.
industry : str | list[str], default 'ALL'
Industry name(s) to include. Supports regex-like substring matching.
exchange : str | list[str], default 'ALL'
Exchange name(s) to include. Common values: 'NASDAQ', 'NYSE', 'AMEX'.
Examples
from finlab import data
with data.us_universe(sector='Technology', exchange='NASDAQ'):
    close = data.get('us_price:close')
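The description above says the filter is applied globally inside the context and restored when the context exits. A generic way to get that behaviour in Python is a `contextlib.contextmanager` with a `try`/`finally`. The sketch below shows the pattern only; `universe_sketch`, `current_filter`, and `_state` are hypothetical names, not FinLab internals.

```python
from contextlib import contextmanager

# Mutable holder standing in for the library's global universe state.
_state = {"filter": None}

@contextmanager
def universe_sketch(**criteria):
    """Set the global filter inside the with-block and restore it afterwards."""
    previous = _state["filter"]
    _state["filter"] = criteria
    try:
        yield criteria
    finally:
        # Runs even if the block raises, so the old filter always comes back.
        _state["filter"] = previous

def current_filter():
    return _state["filter"]

with universe_sketch(sector="Technology", exchange="NASDAQ"):
    print(current_filter())
print(current_filter())
```

Saving the previous value instead of resetting to `None` also makes nested `with` blocks restore correctly.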
set_universe
set_universe(exchange='ALL', sector='ALL', exclude_sector=None, industry='ALL', asset_type=None, *, market=None, category=None, exclude_category=None)
Set global stock universe filter. Auto-dispatches based on current market.
When data.set_market('tw') (or no market set), uses TW logic
(security_categories). For any other market (us, hk, jp, kr, uk),
loads {market}_company_profile and filters by available columns.
Parameters
exchange : str | list[str], default 'ALL'
TW: 'TWSE', 'TPEx' (or legacy 'TSE', 'OTC').
International: 'NASDAQ', 'NYSE', 'AMEX', 'HKEX', 'TSE', etc.
sector : str | list[str], default 'ALL'
Sector name(s) with regex matching.
exclude_sector : str | list[str] | None, default None
Sector(s) to exclude (TW only).
industry : str | list[str], default 'ALL'
Industry filter (international markets).
asset_type : str | None, default None
TW only: 'ETF' or 'STOCK_FUTURE'.
market : str | None
Legacy alias for TW exchange+asset_type (e.g. 'TSE_OTC', 'ETF').
category : str | list[str] | None
Legacy alias for sector.
exclude_category : str | list[str] | None
Legacy alias for exclude_sector.
set_us_universe
Set global US stock universe filter.
Thin wrapper around _set_intl_universe_impl for backward compatibility.
Parameters
sector : str | list[str], default 'ALL'
Sector filter with regex-like substring matching.
industry : str | list[str], default 'ALL'
Industry filter with regex-like substring matching.
exchange : str | list[str], default 'ALL'
Exchange filter (e.g., 'NASDAQ', 'NYSE', 'AMEX').
Examples:
# Example 1: Only listed companies
with data.universe(market='TSE'):
    close = data.get('price:收盤價')
    print(f"Number of listed companies: {len(close.columns)}")
# Example 2: Specific industry sectors
with data.universe(category=['半導體業']):
    close = data.get('price:收盤價')
# Example 3: Top 100 by market cap
# Note: `size` is not among the parameters in the universe() signature above;
# confirm that your FinLab version accepts it before relying on it.
with data.universe(size=100):
    close = data.get('price:收盤價')
# Example 4: Combined conditions (same caveat for `size`)
with data.universe(market='TSE_OTC', category=['電子工業'], size=50):
    close = data.get('price:收盤價')
Available market parameter values:
- 'TSE' - Listed (Taiwan Stock Exchange)
- 'OTC' - OTC (Taipei Exchange)
- 'TSE_OTC' - Listed + OTC
- 'ALL' - All (including Emerging Stock Board)
data.us_universe()
finlab.data.us_universe
Context manager to set a global stock universe filter for US market data retrieval.
This context manager limits the set of US stocks returned by data functions to a specific sector, industry, and exchange selection. The filter is applied globally within the context and is restored after the context exits.
Parameters
sector : str | list[str], default 'ALL'
Sector name(s) to include. Supports regex-like substring matching.
industry : str | list[str], default 'ALL'
Industry name(s) to include. Supports regex-like substring matching.
exchange : str | list[str], default 'ALL'
Exchange name(s) to include. Common values: 'NASDAQ', 'NYSE', 'AMEX'.
Examples
from finlab import data
with data.us_universe(sector='Technology', exchange='NASDAQ'):
    close = data.get('us_price:close')
US Market Filtering:
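# Get S&P 500 constituents
# Note: the parameters documented for us_universe() are sector, industry, and exchange;
# `index` is not among them, so confirm that your FinLab version supports it.
with data.us_universe(index='SPX'):
    close = data.get('price:close')
# Get NASDAQ 100 (same caveat for `index`)
with data.us_universe(index='NDX'):
    close = data.get('price:close')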
data.indicator()
finlab.data.indicator
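Supports hundreds of technical indicators from TA-Lib and pandas_ta, computed across 2,000 stocks and 10 years of data.
Before using this function, install the packages that compute the technical indicators.
| PARAMETER | DESCRIPTION |
|---|---|
| indname | Indicator name. For TA-Lib, e.g. SMA, STOCH, RSI (see the TA-Lib documentation). For pandas-ta, e.g. supertrend, ssf (see the pandas-ta documentation). |
| adjust_price | Whether to compute with adjusted prices. |
| resample | Price period used for the indicator, e.g.: |
| market | Market selection, e.g.: |
| **kwargs | Indicator parameter settings. For TA-Lib's RSI, for example, the adjustable parameter is the calculation period. |

We recommend starting from the examples below together with the official TA-Lib documentation to learn how to build technical indicators.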
Technical Indicator Examples:
from finlab import data
# Get MACD indicator (TA-Lib's MACD has three outputs: macd, signal, histogram)
macd, macd_signal, macd_hist = data.indicator('MACD')
# Get RSI indicator (timeperiod is TA-Lib's name for the calculation period)
rsi = data.indicator('RSI', timeperiod=14)
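To make concrete what an indicator such as SMA computes for a single price series, here is a pure-Python sketch. It is not how FinLab or TA-Lib implement it; the function name `sma` is only illustrative, and it uses `None` where TA-Lib would return NaN for the warm-up period.

```python
def sma(values, timeperiod):
    """Simple moving average; positions without a full window are None."""
    if timeperiod <= 0:
        raise ValueError("timeperiod must be positive")
    result = []
    window_sum = 0.0
    for i, value in enumerate(values):
        window_sum += value
        if i >= timeperiod:
            # Drop the value that just left the window.
            window_sum -= values[i - timeperiod]
        result.append(window_sum / timeperiod if i >= timeperiod - 1 else None)
    return result

print(sma([1, 2, 3, 4, 5], 3))  # [None, None, 2.0, 3.0, 4.0]
```

data.indicator() applies the same kind of calculation to every stock column at once, which is why it returns DataFrames rather than a single series.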
Cache Management
finlab.data.set_storage
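Set how historical data is stored locally.
When historical data is fetched with data.get, a local copy is saved automatically by default so that large datasets are not downloaded repeatedly.
A storage is the interface used to store historical data. Two storage interfaces are provided: finlab.data.CacheStorage (default) and
finlab.data.FileStorage. The former keeps data in memory; the latter keeps it in files. See CacheStorage and FileStorage for details.
By default, finlab.data.FileStorage is used automatically, and repeatedly requested historical data is stored in the operating system's default temporary folder.
| PARAMETER | DESCRIPTION |
|---|---|
| storage | The storage interface |

Examples:
To switch to file-based storage, use the following approach:
The downloaded data can then be found locally at ./finlab_db/price#收盤價.pickle,
and the historical data can be read back with pickle: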
finlab.data.CacheStorage
finlab.data.FileStorage
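Store historical data in files.
| PARAMETER | DESCRIPTION |
|---|---|
| path | Path where the data is stored |
| use_cache | Whether to additionally use a cache that keeps a copy of the data in memory. |

Examples:
To switch to file-based storage, use the following approach:
The downloaded data can then be found locally at ./finlab_db/price#收盤價.pickle,
and the historical data can be read back with pickle: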
Custom Cache Strategy:
from finlab import data
from finlab.data import set_storage, FileStorage
# Use a custom directory
storage = FileStorage('/path/to/custom/cache')
set_storage(storage)
# All subsequent data will be cached to the specified location
close = data.get('price:收盤價')
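For intuition about what a file-based storage does, the sketch below implements a tiny pickle-per-dataset cache. It is an illustration, not FinLab's FileStorage: the class `SketchFileStorage` and its `get`/`set` methods are hypothetical, and the only detail taken from this page is the file naming, where 'price:收盤價' is stored as 'price#收盤價.pickle'.

```python
import pickle
import tempfile
from pathlib import Path

class SketchFileStorage:
    """Illustrative pickle-per-dataset cache (not FinLab's FileStorage)."""

    def __init__(self, path):
        self.path = Path(path)
        self.path.mkdir(parents=True, exist_ok=True)

    def _file(self, name):
        # Mirror the 'price#收盤價.pickle' naming shown on this page
        # (':' is not a valid character in Windows file names).
        return self.path / (name.replace(":", "#") + ".pickle")

    def set(self, name, obj):
        with open(self._file(name), "wb") as f:
            pickle.dump(obj, f)

    def get(self, name, default=None):
        file = self._file(name)
        if not file.exists():
            return default
        with open(file, "rb") as f:
            return pickle.load(f)

with tempfile.TemporaryDirectory() as tmp:
    storage = SketchFileStorage(tmp)
    storage.set("price:收盤價", {"2007-04-23": 57.85})
    cached_file = storage._file("price:收盤價").name
    restored = storage.get("price:收盤價")
print(cached_file, restored)
```

A real storage would also need to decide when a cached file is stale; that logic is omitted here.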
Other Utilities
finlab.data.get_strategies
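Retrieve the data returned for strategies uploaded to the quant platform.
This returns the figures shown on your own strategy dashboard, such as each strategy's return curve, return statistics, Sharpe ratio, recent positions, and recent rebalancing dates. The data can be used to aggregate multiple strategies.
| PARAMETER | DESCRIPTION |
|---|---|
| api_token | If no api_token has been supplied to the finlab module, a GUI page opens automatically; copy the api_token from the web page and paste it into the input field. |

Returns: (dict): strategies data
Response detail: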
``` py
{
strategy1:{
'asset_type': '',
'drawdown_details': {
'2015-06-04': {
'End': '2015-11-03',
'Length': 152,
'drawdown': -0.19879090089478024
},
...
},
'fee_ratio': 0.000475,
'last_trading_date': '2022-06-10',
'last_updated': 'Sun, 03 Jul 2022 12:02:27 GMT',
'ndays_return': {
'1': -0.01132480035770611,
'10': -0.0014737286933147464,
'20': -0.06658015749110646,
'5': -0.002292995729485159,
'60': -0.010108700314771735
},
'next_trading_date': '2022-06-10',
'positions': {
'1413 宏洲': {
'entry_date': '2022-05-10',
'entry_price': 10.05,
'exit_date': '',
'next_weight': 0.1,
'return': -0.010945273631840613,
'status': '買進',
'weight': 0.1479332345384493
},
'last_updated': 'Sun, 03 Jul 2022 12:02:27 GMT',
'next_trading_date': '2022-06-10',
'trade_at': 'open',
'update_date': '2022-06-10'
},
'return_table': {
'2014': {
'Apr': 0.0,
'Aug': 0.06315180932606546,
'Dec': 0.0537589857541485,
'Feb': 0.0,
'Jan': 0.0,
'Jul': 0.02937490104459939,
'Jun': 0.01367930162104769,
'Mar': 0.0,
'May': 0.0,
'Nov': -0.0014734320286596825,
'Oct': -0.045082529665408266,
'Sep': 0.04630906972509852,
'YTD': 0.16626214846456966
},
...
},
'returns': {
'time': [
'2014-06-10',
'2014-06-11',
'2014-06-12',
...
],
'value': [
100,
99.9,
100.2,
...
]
},
'stats': {
'avg_down_month': -0.03304015302646822,
'avg_drawdown': -0.0238021414698247,
'avg_drawdown_days': 19.77952755905512,
'avg_up_month': 0.05293384465715908,
'cagr': 0.33236021285588846,
'calmar': 1.65261094975066,
'daily_kurt': 4.008888367138843,
'daily_mean': 0.3090784769257415,
'daily_sharpe': 1.747909002374217,
'daily_skew': -0.6966018726321078,
'daily_sortino': 2.8300677082214034,
...
},
'tax_ratio': 0.003,
'trade_at': 'open',
'update_date': '2022-06-10'
},
strategy2:{...},
...}
```
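As one example of combining several strategies, the response can be ranked by a statistic such as `daily_sharpe`. The sketch below assumes only the nesting shown above (strategy name, then 'stats', then the statistic); `rank_by_stat` is a hypothetical helper and the sample values are made up.

```python
def rank_by_stat(strategies, stat="daily_sharpe"):
    """Return (name, value) pairs sorted by the given stat, best first."""
    pairs = [
        (name, info["stats"][stat])
        for name, info in strategies.items()
        if stat in info.get("stats", {})
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)

# Made-up response with the same shape as the structure above.
sample = {
    "strategy1": {"stats": {"daily_sharpe": 1.75, "cagr": 0.33}},
    "strategy2": {"stats": {"daily_sharpe": 0.90, "cagr": 0.12}},
    "strategy3": {"stats": {"cagr": 0.05}},  # no daily_sharpe reported
}
print(rank_by_stat(sample))
```

Strategies that lack the requested statistic are skipped rather than raising, which keeps the ranking usable when some entries are incomplete.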
FAQ
Q: How do I find out what data is available for download?
Method 1: Use search()
Method 2: Check the online database. Visit the FinLab Database Catalog to browse the full list.
Q: Data download is slow, what can I do?
# Method 1: Limit time range
data.truncate_start = '2020-01-01'
# Method 2: Use cache (second call will be fast)
close = data.get('price:收盤價') # First call is slow
close = data.get('price:收盤價') # Second call is fast (uses cache)
# Method 3: Use universe to limit stock count
# (`size` is not listed in the universe() signature in the API Reference; verify it first)
with data.universe(size=100):
    close = data.get('price:收盤價')  # Only downloads 100 stocks
Q: KeyError: 'price:收盤價' - what should I do?
Possible causes:
1. Not logged in - Run finlab.login() or finlab.login('YOUR_TOKEN')
2. Incorrect field name - Use data.search('收盤') to verify the correct name
3. Invalid API token - Re-obtain the token
import finlab
# Check if logged in
try:
    token, token_type = finlab.get_token()
    print(f"Logged in ({token_type})")
except Exception:
    print("Not logged in, please run finlab.login()")
Q: How do I download US stock data?
from finlab import data
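# US stock closing prices ('us_price:close' is the name returned by
# data.search('close', market='us') in the API Reference; data.get() lists no market parameter)
us_close = data.get('us_price:close')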
# Search US stock fields
us_fields = data.search(market='us')
Q: What about missing values in the data?
close = data.get('price:收盤價')
# Check missing values
print(f"Missing value ratio: {close.isna().sum().sum() / close.size:.2%}")
# Fill missing values
close_filled = close.ffill()  # Forward fill (fillna(method='ffill') is deprecated in recent pandas)
# Or drop stocks with too many missing values
close_clean = close.dropna(axis=1, thresh=int(len(close) * 0.8))  # Keep stocks with 80%+ data
Q: How can I save memory?
# Method 1: Limit time range
data.truncate_start = '2020-01-01'
# Method 2: Process in batches
close = data.get('price:收盤價')
all_stocks = close.columns
for batch in [all_stocks[i:i+100] for i in range(0, len(all_stocks), 100)]:
    batch_close = close[batch]
    # Process 100 stocks at a time...
# Method 3: Only download the fields you need
# Avoid calling data.get() for too many tables at once
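The batching idea in Method 2 can be factored into a small reusable generator. This is a generic sketch that is independent of FinLab; `chunked` is a hypothetical helper name and the stock ids are placeholders.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]

stock_ids = ["0050", "1101", "2002", "2330", "2603"]
batches = list(chunked(stock_ids, 2))
print(batches)
```

With a DataFrame, each batch could be used as a column selection, for example `close[batch]`, so only one slice is processed at a time.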
Resources
- Data Download Detailed Tutorial - Complete usage guide
- FinLab Database Catalog - All available data tables
- Quick Start Guide - Getting started
- FAQ - More troubleshooting