Data

Retrieve key stock market data with a single line of code. Most commonly used financial and market data can be accessed directly.

Downloading Data

Use the data.get function to download over 10 years of historical records for thousands of stocks at once:

Taiwan stocks:

from finlab import data
close = data.get('price:收盤價')

date         1001    ...  2330
2007-04-23   39.65   ...  38.3
...          ...     ...  ...
2023-05-02   39.85   ...  38.85

US stocks:

from finlab import data
data.set_market('us')
close = data.get('price:adj_close')

date         AAPL    ...  TSLA
2015-01-02   27.33   ...  14.62
...          ...     ...  ...
2023-05-02   169.59  ...  161.83

The return type is FinlabDataFrame (an extended Pandas DataFrame). If you are familiar with Pandas, you can get started right away; beginners may want to read the Pandas 10-minute guide first.

  • Vertical axis (rows): market trading dates; each cell holds the closing price for that date.
  • Horizontal axis (columns): stock symbols, making it very convenient for building stock screening strategies.
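
As a rough sketch of why this layout suits screening, the snippet below builds a tiny date-by-symbol table with made-up prices (the symbols AAA and BBB and all values are hypothetical, not FinLab data) and picks the symbols whose latest close is above their 3-day average. It uses plain pandas; since FinlabDataFrame is described as an extended DataFrame, the same operations are assumed to apply to the result of data.get.

```python
import pandas as pd

# Made-up prices for illustration only; real data would come from data.get(...).
dates = pd.to_datetime(["2023-04-26", "2023-04-27", "2023-04-28", "2023-05-02"])
close = pd.DataFrame(
    {"AAA": [10.0, 11.0, 12.0, 13.0], "BBB": [50.0, 49.0, 48.0, 47.0]},
    index=dates,
)

# One row per trading date, one column per stock symbol.
sma3 = close.rolling(3).mean()   # 3-day moving average, computed per column
above_avg = close > sma3         # boolean table with the same shape as close

# Keep the symbols whose most recent close is above their 3-day average.
latest = above_avg.iloc[-1]
selected = latest[latest].index.tolist()
print(selected)
```

Because every column is one stock, a condition written once is evaluated for all stocks at the same time.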

How to skip the login prompt?

You can log in programmatically (examples below) to bypass the GUI authentication.

What other data is available?

Visit the Database Catalog to find available datasets and their key names.

Free data has a shorter range

Free users can download most datasets for backtesting, but recent data is not included. Upgrade to VIP to access the latest data.

Use the AI Assistant to find data

Not sure about the dataset name? After installing the FinLab Skill in your AI coding assistant, simply ask "What revenue-related datasets are available?" to get the correct data.get() parameters and usage examples.

Search Available Fields (data.search)

Use data.search(keyword=None, market='tw') to search for available field names. The return format is table:column.

  • keyword: keyword (case-insensitive); pass None to list all
  • market: 'tw' (default, Taiwan stocks), 'us' (US stocks), 'all' (both)
  • An invalid market value raises a ValueError
from finlab import data

# List all Taiwan market fields (default)
all_tw = data.search()

# Taiwan keyword search
close_tw = data.search('收盤')

# US market keyword search
close_us = data.search('close', market='us')

# Search all markets
price_all = data.search('price', market='all')
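
Since field names follow the table:column format, they are easy to split and group. The sketch below assumes data.search returns a list of strings in that format; split_key and group_by_table are hypothetical helpers, not part of finlab, and the keys are hard-coded for illustration.

```python
from collections import defaultdict


def split_key(key):
    """Split a 'table:column' field name into (table, column)."""
    table, sep, column = key.partition(":")
    if not sep or not table or not column:
        raise ValueError(f"expected 'table:column', got {key!r}")
    return table, column


def group_by_table(keys):
    """Group column names under the table they belong to."""
    grouped = defaultdict(list)
    for key in keys:
        table, column = split_key(key)
        grouped[table].append(column)
    return dict(grouped)


# Illustrative keys only; in practice pass the result of data.search(...).
keys = ["price:收盤價", "price:成交股數", "price:adj_close"]
grouped = group_by_table(keys)
print(grouped)
```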

Catalog source: Firestore data_categories/finlab_tw_stock (Taiwan) and data_categories/finlab_us_stock (US), with LRU caching.

Automatic Login to Retrieve Historical Data

Please log in before downloading. In Jupyter/Colab, a login window will pop up automatically; for VSCode or standalone scripts, programmatic login is recommended.

Go to the Member Area to obtain your API token, then log in:

import finlab
finlab.login('YOUR_API_TOKEN')

A more secure approach

You can use environment variables to avoid exposing your API_TOKEN. The following setup needs to be repeated each time you open a terminal. If you are not a software engineering professional, you may skip this step.

On Windows, open the Command Prompt and enter:

set FINLAB_API_TOKEN=<YOUR_API_TOKEN>

On macOS/Linux, open a terminal and enter:

export FINLAB_API_TOKEN=<YOUR_API_TOKEN>

Market and Sector Scope

You can use data.universe to limit the data scope. For example, to get the closing prices of the "Cement" sector for listed and OTC stocks:

from finlab import data

with data.universe(market='TSE_OTC', category=['水泥工業']):
    price = data.get('price:收盤價')

This retrieves closing prices for cement-industry stocks in the TSE and OTC markets.

Market Scope (market)

For the Taiwan stock market, the available market scope options are:

Market Code   Description
ALL           All markets, including TSE, OTC, Emerging, and public offerings
TSE           Listed stocks (Taiwan Stock Exchange)
OTC           OTC stocks (Taipei Exchange)
TSE_OTC       Listed + OTC stocks
ETF           Exchange-Traded Funds

ETF-related categories:

Category                Description
domestic_etf            ETFs with Taiwan stocks as constituents
foreign_etf             ETFs with foreign assets as constituents
leveraged_etf           Leveraged ETFs
vanilla_futures_etf     Non-leveraged futures ETFs
leveraged_futures_etf   Leveraged futures ETFs

Sector Scope (category)

The sector scope is classified by industry. Available sectors include:

光電業 其他電子業 化學工業 半導體 塑膠工業 存託憑證 建材營造 文化創意業 橡膠工業 水泥工業 汽車工業 油電燃氣業 玻璃陶瓷 生技醫療 生技醫療業 紡織纖維 航運業 觀光事業 貿易百貨 資訊服務業 農業科技 通信網路業 造紙工業 金融 鋼鐵工業 電器電纜 電子商務 電子通路業 電子零組件 電機機械 電腦及週邊 食品工業

Disable fuzzy matching

Category matching uses regular expressions, so a pattern matches any category name that contains it. Setting category to "其他" (Other) therefore selects every category containing "其他", including "其他證券" (Other Securities). To match "其他" exactly and leave out "其他證券", anchor the pattern:

from finlab import data

with data.universe('TSE_OTC', ['^其他$']):
    close_subset = data.get('price:收盤價')
    print(close_subset)

The ^ and $ anchors mark the start and end of the name, so only the category named exactly "其他" is selected.
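
The difference between the two patterns can be checked with Python's re module alone. This is a minimal sketch that assumes the library applies each pattern with unanchored, search-style matching, which is consistent with the behavior described here; the category list is a small illustrative sample.

```python
import re

# Illustrative sample of category names.
categories = ["其他", "其他證券", "其他電子業", "水泥工業"]

# Unanchored pattern: matches any name that contains "其他".
fuzzy = [name for name in categories if re.search("其他", name)]

# Anchored pattern: ^ and $ require the whole name to equal "其他".
exact = [name for name in categories if re.search("^其他$", name)]

print(fuzzy)
print(exact)
```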

Exclude Categories (exclude_category)

To exclude specific sectors from the scope, use the exclude_category parameter. This parameter accepts a string or list and also uses regex for fuzzy matching.

from finlab import data

# Select only TSE/OTC "Cement" related sectors, excluding "Finance" related sectors
with data.universe(market='TSE_OTC', category=['水泥工業'], exclude_category=['金融']):
    price = data.get('price:收盤價')

Execution order

exclude_category defaults to None. When both category and exclude_category are set, the system first selects the set matching category, then applies the exclude_category exclusion.
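
The include-then-exclude order can be sketched in plain Python. filter_categories is a hypothetical stand-in, not the finlab implementation; it assumes that a name is kept when any category pattern matches and dropped when any exclude_category pattern matches.

```python
import re


def _as_list(value):
    """Accept None, a single string, or a list of strings."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def filter_categories(names, category=None, exclude_category=None):
    include = [re.compile(p) for p in _as_list(category)]
    exclude = [re.compile(p) for p in _as_list(exclude_category)]

    # Step 1: keep names matching any include pattern (all names if none given).
    selected = [
        n for n in names
        if not include or any(p.search(n) for p in include)
    ]
    # Step 2: drop names matching any exclude pattern.
    return [n for n in selected if not any(p.search(n) for p in exclude)]


names = ["電子商務", "電子通路業", "其他電子業", "水泥工業"]
result = filter_categories(names, category="電子", exclude_category=["其他"])
print(result)
```

Here "其他電子業" passes the first step because it contains "電子", then is removed in the second step because it also contains "其他".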

Common Errors and Solutions

Error 1: KeyError - Dataset Not Found

Symptom: data.get() raises a KeyError

close = data.get('price:收盤')  # Wrong: incorrect field name
# KeyError: 'price:收盤'

Causes:

  • Dataset name is misspelled (the correct name is "收盤價", not "收盤")
  • The dataset does not exist in the database
  • API Token is not set or is invalid, preventing data access

Solution:

from finlab import data

# Method 1: Search for the correct field name first
matching_fields = data.search('收盤')
print(matching_fields)
# Example output: ['price:收盤價', ...]

# Method 2: Use try-except to handle the error
try:
    close = data.get('price:收盤價')
    print(f"Data downloaded successfully, range: {close.index[0]} ~ {close.index[-1]}")
except KeyError as e:
    print(f"Dataset not found: {e}")
    print("Use data.search('keyword') to find the correct field name")
    print("Or visit https://ai.finlab.tw/database to browse available datasets")
except Exception as e:
    print(f"Download failed: {e}")
    print("Please check:")
    print("1. Is the API Token set? (finlab.login('YOUR_TOKEN'))")
    print("2. Is the network connection working?")
    print("3. Do you have a valid VIP membership?")
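
When the misspelling is small, the standard library's difflib can propose the closest known field name. This is only a sketch: suggest_keys is a hypothetical helper, and known_keys is a hard-coded illustrative list standing in for the result of data.search().

```python
import difflib


def suggest_keys(wrong_key, known_keys, limit=3):
    """Return up to `limit` known keys that closely resemble wrong_key."""
    return difflib.get_close_matches(wrong_key, known_keys, n=limit, cutoff=0.8)


# Illustrative list; in practice use the field names returned by data.search().
known_keys = ["price:收盤價", "price:成交股數", "price:adj_close"]

suggestions = suggest_keys("price:收盤", known_keys)
print(suggestions)
```

The 0.8 cutoff is an arbitrary choice; because every key in a table shares the same prefix, a lower cutoff tends to return unrelated columns from that table.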

Error 2: Empty DataFrame

Symptom: Download succeeds but the DataFrame has no data

close = data.get('price:收盤價')
print(close)
# Empty DataFrame
# Columns: []
# Index: []

Causes:

  • Filtering conditions in data.universe() are too strict (e.g., sector does not exist)
  • Free user's data is outside the available range
  • The data source temporarily has no data

Solution:

from finlab import data

# Check whether the download result is empty
try:
    with data.universe(market='TSE', category=['不存在的類股']):
        close = data.get('price:收盤價')

    # Check data completeness
    if close.empty:
        print("Warning: downloaded data is empty")
        print("Possible reasons:")
        print("1. Universe filter conditions are too strict (sector name does not exist)")
        print("2. The dataset itself has no data")
        raise ValueError("Downloaded data is empty, please check filter conditions")

    if close.shape[0] < 10:
        print(f"Warning: too few data rows (only {close.shape[0]})")

    if close.shape[1] < 10:
        print(f"Warning: too few stocks (only {close.shape[1]})")

    print(f"Data is normal: {close.shape[0]} trading days, {close.shape[1]} stocks")

except ValueError as e:
    print(f"{e}")
    # Check available sector list
    print("\nAvailable sectors:")
    print("水泥工業, 塑膠工業, 半導體, 電腦及週邊...")
    print("See the full list in the Sector Scope (category) section")

Error 3: Network Timeout

Symptom: No response for a long time when downloading, eventually throws Timeout or connection error

close = data.get('price:收盤價')
# requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='...', port=443): Read timed out.

Causes:

  • Unstable network connection
  • Data volume is too large (e.g., full market 10+ years of data)
  • Server temporarily unresponsive

Solution:

import time

import finlab
from finlab import data

# Method 1: Shorten the data range
finlab.truncate_start = '2020-01-01'  # Only download data from 2020-01-01 onward

try:
    close = data.get('price:收盤價')
except Exception as e:
    if "timeout" in str(e).lower() or "timed out" in str(e).lower():
        print("Network timeout, attempting retry...")
        time.sleep(5)  # Wait 5 seconds
        try:
            close = data.get('price:收盤價')
            print("Retry successful")
        except Exception as retry_error:
            print(f"Retry failed: {retry_error}")
            print("Suggestions:")
            print("1. Check network connection")
            print("2. Use finlab.truncate_start to shorten data range")
            print("3. Try again later")
            raise
    else:
        raise

# Method 2: Use local cache (if previously downloaded)
# finlab automatically caches downloaded data; the second run reads from cache
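
The single retry above can be generalized into a small helper with exponential backoff. This is a sketch, not a finlab feature: retry and flaky_download are hypothetical names, and the stand-in function simulates two timeouts so the example runs without a network connection.

```python
import time


def retry(func, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**n seconds, then try again."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:  # broad on purpose; narrow to timeout errors if you can
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Stand-in for a download that times out twice, then succeeds.
calls = {"count": 0}


def flaky_download():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("Read timed out.")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = retry(flaky_download, attempts=3, base_delay=1.0, sleep=delays.append)
print(result, delays)
```

With finlab, the call would presumably look like retry(lambda: data.get('price:收盤價'), base_delay=5.0).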

Error 4: API Token Authentication Failure

Symptom: Unable to log in or permission error when downloading data

import finlab
finlab.login('INVALID_TOKEN')
# AuthenticationError: Invalid API token

Causes:

  • API Token is incorrect or expired
  • Token not yet activated (newly registered but not verified)
  • Insufficient account permissions (free user accessing VIP-only data)

Solution:

import os
import sys

import finlab
from finlab import data

# Method 1: Environment variable approach (recommended, avoids exposure)
token = os.environ.get('FINLAB_API_TOKEN')
if not token:
    print("Environment variable FINLAB_API_TOKEN is not set")
    print("Please run:")
    print("  MacOS/Linux: export FINLAB_API_TOKEN='YOUR_TOKEN'")
    print("  Windows: set FINLAB_API_TOKEN=YOUR_TOKEN")
    sys.exit(1)

try:
    finlab.login(token)
    print("Login successful")

    # Verify data access
    from finlab import data
    close = data.get('price:收盤價')
    print(f"Data access is working, latest date: {close.index[-1]}")

except Exception as e:
    print(f"Login failed: {e}")
    print("\nPlease check:")
    print("1. Is the API Token correct? (Get it at https://ai.finlab.tw/member_info)")
    print("2. Is the account a valid VIP member?")
    print("3. Has the Token been activated? (New registrations need email verification)")
    print("4. Is the network connection working?")

# Method 2: Check membership level (determine if latest data is accessible)
close = data.get('price:收盤價')
if close.index[-1].year < 2024:
    print("You are currently using the free version, which only has access to older data")
    print("Upgrade to VIP for the latest data: https://ai.finlab.tw/pricing")
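
A fixed year in the check above goes out of date. One alternative is to compare the last available date with today's date. The sketch below uses a hypothetical is_stale helper, and the 30-day threshold is an arbitrary assumption, not a documented FinLab limit.

```python
from datetime import date


def is_stale(last_date, today=None, max_age_days=30):
    """Return True if last_date is more than max_age_days before today."""
    today = today or date.today()
    return (today - last_date).days > max_age_days


# Fixed dates so the example is reproducible.
print(is_stale(date(2023, 5, 2), today=date(2023, 5, 10)))
print(is_stale(date(2023, 5, 2), today=date(2024, 1, 1)))
```

With a downloaded table, the first argument would be close.index[-1].date(), assuming the index holds pandas timestamps.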

Reference Resources