finlab.dataframe
FinlabDataFrame extends Pandas DataFrame with automatic data alignment, technical indicator computation, and stock selection utilities.
Use Cases
- Automatically align data of different frequencies (daily, weekly, monthly, quarterly)
- Quickly perform stock selection (ranking, top-N filtering)
- Compute moving averages and technical indicators
- Manage entry/exit signals (hold until stop-loss/take-profit)
- Industry neutralization
- Cross-sectional factor preprocessing (
df.cs: rank, winsorize, z-score) - Within-sector transforms and aggregation (
df.sector: sector rank, mean, std) - Portfolio weight construction (
df.weight: inverse volatility, risk parity, drawdown control)
Quick Examples
Auto-Alignment and Stock Selection
from finlab import data
# data.get() returns FinlabDataFrame
close = data.get('price:收盤價') # Daily
revenue = data.get('monthly_revenue:當月營收') # Monthly
# Automatic date alignment (revenue is forward-filled to daily frequency)
position = (close > close.average(20)) & (revenue > 1e8)
Ranking and Filtering
from finlab import data
close = data.get('price:收盤價')
marketcap = data.get('etl:market_value')
# Select the 30 smallest by market cap
small_cap = marketcap.is_smallest(30)
# Select the 20 largest by price momentum
momentum = (close / close.shift(20)).is_largest(20)
# Combine strategies
position = small_cap & momentum
Position Management
from finlab import data
from finlab.backtest import sim
close = data.get('price:收盤價')
position = close > close.average(20)
# Hold until 10% stop-loss or 20% take-profit
position_with_stops = position.hold_until(stop_loss=0.1, take_profit=0.2)
# Backtest
report = sim(position_with_stops, resample='M')
report.display()
FinlabDataFrame vs pandas DataFrame
| Feature | pandas DataFrame | FinlabDataFrame |
|---|---|---|
| Basic operations | Fully supported | Fully supported (inherits pandas) |
| Auto-alignment | Requires manual handling | Automatically aligns different frequency data |
| Stock selection | Write your own | Built-in is_largest(), is_smallest() |
| Technical indicators | Requires extra packages | Built-in average(), rise(), fall() |
| Position management | Write your own | Built-in hold_until(), exit_when() |
| Industry analysis | Write your own | Built-in industry_rank(), neutralize_industry() |
| Factor preprocessing | Write your own | Built-in df.cs cross-sectional transforms |
| Within-sector analysis | Write your own | Built-in df.sector sector transforms & aggregation |
| Weight construction | Write your own | Built-in df.weight with 8 weighting methods |
API Reference
FinlabDataFrame
finlab.dataframe.FinlabDataFrame
Bases: IndexConversionMixin, SignalMixin, NeutralizeMixin, IndustryMixin, DataFrame
回測語法糖
除了使用熟悉的 Pandas 語法外,我們也提供很多語法糖,讓大家開發程式時,可以用簡易的語法完成複雜的功能,讓開發策略更簡潔!
我們將所有的語法糖包裹在 FinlabDataFrame 中,用起來跟 pd.DataFrame 一樣,但是多了很多功能!
只要使用 finlab.data.get() 所獲得的資料,皆為 FinlabDataFrame 格式,
接下來我們就來看看, FinlabDataFrame 有哪些好用的語法糖吧!
當資料日期沒有對齊(例如: 財報 vs 收盤價 vs 月報)時,在使用以下運算符號:
+, -, *, /, >, >=, ==, <, <=, &, |, ~,
不需要先將資料對齊,因為 FinlabDataFrame 會自動幫你處理,以下是示意圖。

以下是範例:cond1 與 cond2 分別為「每天」,和「每季」的資料,假如要取交集的時間,可以用以下語法:
from finlab import data
# 取得 FinlabDataFrame
close = data.get('price:收盤價')
roa = data.get('fundamental_features:ROA稅後息前')
# 運算兩個選股條件交集
cond1 = close > 37
cond2 = roa > 0
cond_1_2 = cond1 & cond2
cond1 跟 cond2 訊號的頻率雖然不相同,
但是由於 cond1 跟 cond2 是 FinlabDataFrame,所以可以直接取交集,而不用處理資料頻率對齊的問題。
總結來說,FinlabDataFrame 與一般 dataframe 唯二不同之處:
1. 多了一些 method,如df.is_largest(), df.sustain()...等。
2. 在做四則運算、不等式運算前,會將 df1、df2 的 index 取聯集,column 取交集。
average
取 n 筆移動平均
若股票在時間窗格內,有 N/2 筆 NaN,則會產生 NaN。 Args: n (positive-int): 設定移動窗格數。 Returns: (pd.DataFrame): data Examples: 股價在均線之上
只需要簡單的語法,就可以將其中一部分的訊號繪製出來檢查:import matplotlib.pyplot as plt
close.loc['2021', '2330'].plot()
sma.loc['2021', '2330'].plot()
cond.loc['2021', '2330'].mul(20).add(500).plot()
plt.legend(['close', 'sma', 'cond'])
fall
is_largest
is_smallest
lazy
Wrap this DataFrame into a lazy computation handle.
Returns a LazyWide proxy that records operations without
executing them. Call .rebalance(freq) or .collect()
on the result to trigger execution at specific dates only.
Example::
close = data.get('price:收盤價')
position = (
close.lazy()
.rolling(60).mean()
.rank(axis=1, pct=True)
.is_largest(10)
.rebalance('M')
)
quantile_row
rank
Cross-sectional or time-series ranking.
| PARAMETER | DESCRIPTION |
|---|---|
valid
|
只有 valid 為 True 的 cell 才參與排名。 valid 為 False/NaN 的 cell 在排名前設為 NaN, 因此不影響 pct=True 的分母。 常見用法:fillna 後傳入原始 notna() mask, 避免新上市或歷史不足的股票污染百分位排名。
TYPE:
|
rebalance
Downsample to keep only the last observation per period.
Equivalent to self.resample(freq).last().
| PARAMETER | DESCRIPTION |
|---|---|
freq
|
Pandas frequency string, e.g.
TYPE:
|
Key FinlabDataFrame Methods:
Stock Selection and Ranking
| Method | Description | Example |
|---|---|---|
is_largest(n) |
Select the N largest values | marketcap.is_largest(30) |
is_smallest(n) |
Select the N smallest values | pe_ratio.is_smallest(20) |
rank() |
Compute ranking (supports industry ranking) | close.rank(pct=True) |
industry_rank() |
Ranking within industry | close.industry_rank() |
Technical Indicators
| Method | Description | Example |
|---|---|---|
average(n) |
N-period moving average | close.average(20) |
rise(n) |
Consecutive N-period rise | close.rise(3) |
fall(n) |
Consecutive N-period fall | close.fall(2) |
sustain(n) |
Condition sustained for N periods | (close > ma20).sustain(5) |
Position Management
| Method | Description | Example |
|---|---|---|
hold_until() |
Hold until stop-loss/take-profit | position.hold_until(stop_loss=0.1) |
exit_when() |
Exit on condition | position.exit_when(close < ma20) |
entry_price() |
Get entry price | position.entry_price('close') |
Industry Analysis
| Method | Description | Example |
|---|---|---|
neutralize_industry() |
Industry neutralization | factor.neutralize_industry() |
groupby_category() |
Group by industry | close.groupby_category().mean() |
average()
Compute moving average.
Usage Examples:
from finlab import data
close = data.get('price:收盤價')
# 20-day moving average
ma20 = close.average(20)
# 60-day moving average
ma60 = close.average(60)
# Moving average crossover strategy
golden_cross = ma20 > ma60
Difference from pandas rolling()
average(20)is equivalent torolling(20).mean()- More concise and readable
- Return value is still a FinlabDataFrame, allowing method chaining
is_largest() / is_smallest()
Select top/bottom N by ranking.
Usage Examples:
from finlab import data
close = data.get('price:收盤價')
marketcap = data.get('etl:market_value')
# Select the 30 smallest by market cap (small-cap stocks)
small_cap = marketcap.is_smallest(30)
# Select the 20 largest by momentum (momentum stocks)
momentum = (close / close.shift(20)).is_largest(20)
# Combine: momentum stocks among small-caps
position = small_cap & momentum
Notes
is_largest(n)re-ranks every period (daily/monthly)- If fewer than N stocks are available, all are selected
- Returns a Boolean DataFrame (True = selected)
rank()
Compute ranking (supports percentile ranking).
Usage Examples:
from finlab import data
close = data.get('price:收盤價')
marketcap = data.get('etl:market_value')
# Compute percentile ranking (0-1)
cap_rank = marketcap.rank(pct=True, axis=1)
# Select top 30% by market cap
position = cap_rank > 0.7
# Or use is_largest (more intuitive)
position = marketcap.is_largest(int(len(marketcap.columns) * 0.3))
industry_rank()
Ranking within industry.
Usage Examples:
from finlab import data
close = data.get('price:收盤價')
# Compute within-industry ranking (0-1)
industry_rank = close.industry_rank()
# Select top 30% performers within each industry
position = industry_rank > 0.7
Industry Classification Source
- Uses TEJ industry classification
- Automatically handles industry reclassification
rise() / fall()
Detect consecutive rises/falls.
Usage Examples:
from finlab import data
close = data.get('price:收盤價')
# 3 consecutive days of rising
rising = close.rise(3)
# 2 consecutive days of falling
falling = close.fall(2)
# Strategy: reverse signal after consecutive rises
position = ~rising # Inverse signal
sustain()
Check if a condition is sustained for N periods.
Usage Examples:
from finlab import data
close = data.get('price:收盤價')
ma20 = close.average(20)
# Price stays above 20-day MA for 5 consecutive days
sustained_above_ma = (close > ma20).sustain(5)
# Strategy
position = sustained_above_ma
sustain vs rise
rise(n): value increases for n consecutive periodssustain(n): condition is True for n consecutive periods
hold_until()
Hold position until stop-loss/take-profit triggers.
Usage Examples:
from finlab import data
from finlab.backtest import sim
close = data.get('price:收盤價')
position = close > close.average(20)
# Basic usage: 10% stop-loss
position_with_sl = position.hold_until(stop_loss=0.1)
# 10% stop-loss + 20% take-profit
position_with_stops = position.hold_until(stop_loss=0.1, take_profit=0.2)
# Backtest
report = sim(position_with_stops, resample='M')
report.display()
# Trailing stop requires setting in sim()
report = sim(position, resample='M', trail_stop=0.2)
Key Concepts
hold_until()extends the holding period until the trigger condition- Stop-loss/take-profit are calculated based on the entry price
- Trailing stop must be set via
sim(trail_stop=...), exits when price drops a specified percentage from the peak
exit_when()
Exit when a specific condition is met.
Usage Examples:
from finlab import data
close = data.get('price:收盤價')
ma20 = close.average(20)
ma60 = close.average(60)
# Entry: short MA crosses above long MA
entry = ma20 > ma60
# Exit: short MA crosses below long MA
exit_signal = ma20 < ma60
# Hold until exit signal
position = entry.exit_when(exit_signal)
exit_when vs hold_until
exit_when(): Custom exit conditions (e.g., technical indicators)hold_until(): Fixed stop-loss/take-profit percentages
neutralize_industry()
Industry neutralization - removes industry factor effects.
Usage Examples:
from finlab import data
# Raw factor (with industry influence)
factor = data.get('fundamental_features:股東權益報酬率')
# Industry neutralization (removes industry average)
neutral_factor = factor.neutralize_industry()
# Use neutralized factor
position = neutral_factor.is_largest(30)
Why Neutralize?
- Some industries naturally have higher ROE (e.g., financial sector)
- Neutralization avoids over-concentration in a single industry
- Improves strategy stability
Portfolio Weight Construction df.weight
The df.weight accessor provides methods for constructing and adjusting portfolio weights. All methods return FinlabDataFrame and support method chaining.
Method Overview
| Method | Description | Example |
|---|---|---|
cap_industry(max_weight) |
Cap industry weight, redistribute excess proportionally | w.weight.cap_industry(0.3) |
clip_by_volume(total_fund, max_participation_ratio, volume, window) |
Clip stock weight by average daily volume | w.weight.clip_by_volume(1e7) |
inverse_volatility(window, price) |
Inverse volatility weighting — low-vol stocks get higher weight | w.weight.inverse_volatility(60) |
risk_parity(window, price) |
Risk parity weighting — equal risk contribution per holding | w.weight.risk_parity(60) |
correlation(window, price, diversify) |
Adjust weights by correlation — diversify or concentrate | w.weight.correlation(diversify=True) |
target_volatility(target, window, price) |
Scale weights to achieve target annualized volatility | w.weight.target_volatility(0.15) |
limit_turnover(max_turnover) |
Cap two-way turnover between rebalances | w.weight.limit_turnover(0.3) |
drawdown_control(max_drawdown, price) |
Reduce exposure when drawdown exceeds threshold | w.weight.drawdown_control(0.15) |
Usage Example
from finlab import data
from finlab.backtest import sim
close = data.get('price:收盤價')
marketcap = data.get('etl:market_value')
# Stock selection
position = marketcap.is_smallest(30)
# Chain weight construction
weighted = (
position
.weight.inverse_volatility(window=60) # Inverse volatility weighting
.weight.cap_industry(max_weight=0.3) # Industry cap at 30%
.weight.limit_turnover(max_turnover=0.3) # Turnover cap at 30%
.weight.drawdown_control(max_drawdown=0.15) # Drawdown control at 15%
)
report = sim(weighted, resample='M')
report.display()
Method ordering
- Apply weighting first (inverse_volatility / risk_parity), then constraints (cap_industry / clip_by_volume)
limit_turnoveranddrawdown_controlare typically applied last
Cross-Sectional Transforms df.cs
The df.cs (cross-sectional) accessor provides transforms computed across all stocks on each date.
Method Overview
| Method | Description | Example |
|---|---|---|
rank() |
Percentile rank (0, 1] | factor.cs.rank() |
winsorize(lower, upper) |
Percentile winsorization | factor.cs.winsorize(0.05, 0.95) |
bucket(n) |
Equal-quantile bucketing (1..n) | factor.cs.bucket(5) |
zscore() |
Standardization (mean 0, std 1) | factor.cs.zscore() |
demean() |
Subtract cross-sectional mean | factor.cs.demean() |
Usage Example
from finlab import data
roe = data.get('fundamental_features:股東權益報酬率')
# Factor preprocessing: winsorize + standardize
clean_roe = roe.cs.winsorize(0.05, 0.95).cs.zscore()
# Bucket into 5 groups for backtesting
group = roe.cs.bucket(5)
long_group = group == 5 # Long the top quintile
# Percentile rank stock selection
position = roe.cs.rank() > 0.8 # Select top 20%
Sector-Relative Transforms & Aggregation df.sector
The df.sector accessor provides two categories of functionality:
- Within-sector transforms: Same 5 methods as
df.cs(rank / winsorize / bucket / zscore / demean), computed within each sector - Within-sector aggregation: mean / std / median / sum / min / max / count — sector-level statistics
Transform Methods
| Method | Description | Example |
|---|---|---|
rank() |
Within-sector percentile rank | factor.sector.rank() |
winsorize(lower, upper) |
Within-sector winsorization | factor.sector.winsorize(0.05, 0.95) |
bucket(n) |
Within-sector bucketing | factor.sector.bucket(3) |
zscore() |
Within-sector standardization | factor.sector.zscore() |
demean() |
Within-sector demeaning | factor.sector.demean() |
Aggregation Methods
| Method | Description | Example |
|---|---|---|
mean() |
Sector mean | roe.sector.mean() |
std(ddof=1) |
Sector standard deviation | roe.sector.std() |
median() |
Sector median | roe.sector.median() |
sum() |
Sector sum | revenue.sector.sum() |
min() |
Sector minimum | close.sector.min() |
max() |
Sector maximum | close.sector.max() |
count() |
Count of non-NaN values in sector | roe.sector.count() |
Usage Example
from finlab import data
roe = data.get('fundamental_features:股東權益報酬率')
close = data.get('price:收盤價')
# Within-sector rank (top 20% ROE within each sector)
position = roe.sector.rank() > 0.8
# Within-sector standardization (removes sector bias)
neutralized_roe = roe.sector.zscore()
# Compare against sector average
sector_avg_roe = roe.sector.mean()
above_avg = roe > sector_avg_roe # Stocks above sector average
sector vs neutralize_industry()
df.sector.zscore()achieves the same effect asneutralize_industry()(industry neutralization)df.sectoroffers more flexibility: ranking, bucketing, aggregation, and more
Complete Example: Multi-Filter Strategy
from finlab import data
from finlab.backtest import sim
# Step 1: Load data
close = data.get('price:收盤價')
marketcap = data.get('etl:market_value')
revenue = data.get('monthly_revenue:當月營收')
roe = data.get('fundamental_features:股東權益報酬率')
# Step 2: Define filter conditions
cond1 = marketcap.is_smallest(100) # Small-cap
cond2 = (revenue.average(3) / revenue.average(12)) > 1.2 # Revenue growth
cond3 = roe.rank(pct=True, axis=1) > 0.7 # Top 30% ROE
cond4 = (close > close.average(20)).sustain(3) # Above 20MA for 3 days
# Step 3: Combine conditions
position = cond1 & cond2 & cond3 & cond4
# Step 4: Add stop-loss/take-profit
position = position.hold_until(stop_loss=0.1, take_profit=0.2)
# Step 5: Limit number of holdings
position = position.is_largest(20) # Max 20 stocks
# Step 6: Backtest
report = sim(position, resample='M')
report.display()
# Step 7: Analyze performance
metrics = report.get_metrics()
print(f"Annual return: {metrics['annual_return']:.2%}")
print(f"Sharpe ratio: {metrics['daily_sharpe']:.2f}")
print(f"Max drawdown: {metrics['max_drawdown']:.2%}")
FAQ
Q: How does FinlabDataFrame automatically align different frequency data?
from finlab import data
# Daily data
close = data.get('price:收盤價') # Updated daily
print(close.index[0]) # 2010-01-04
# Monthly data
revenue = data.get('monthly_revenue:當月營收') # Published around the 10th of each month
print(revenue.index[0]) # 2010-01-10
# Direct operations auto-align (revenue is forward-filled to daily frequency)
position = (close > 100) & (revenue > 1e8)
# position is daily data, revenue has been auto-filled
Alignment Rules: - Low frequency to high frequency: Forward fill - Quarterly reports (financial statements): Automatically delayed to the financial statement announcement date
Q: How do I chain multiple FinlabDataFrame methods?
from finlab import data
close = data.get('price:收盤價')
# Method chaining (similar to pandas method chaining)
position = (
close
.average(20) # Compute 20MA
.rank(pct=True, axis=1) # Rank
.is_largest(30) # Select top 30
)
# Equivalent to
ma20 = close.average(20)
ranked = ma20.rank(pct=True, axis=1)
position = ranked.is_largest(30)
Q: What is the difference between is_largest() and rank() + threshold?
from finlab import data
close = data.get('price:收盤價')
# Method 1: is_largest (fixed count)
position1 = close.is_largest(30) # Always selects 30 stocks
# Method 2: rank + threshold (fixed percentage)
position2 = close.rank(pct=True, axis=1) > 0.8 # Selects top 20% (count varies)
# Difference:
# - Method 1: Fixed number of holdings (suitable for fixed capital allocation)
# - Method 2: Variable number of holdings (adapts to market conditions)
Q: How does trailing stop (trail_stop) work?
Trailing stop must be set via the trail_stop parameter in sim() -- hold_until() does not support this feature.
from finlab import data
from finlab.backtest import sim
close = data.get('price:收盤價')
position = close > close.average(20)
# Example: Entry price is 100
# - stop_loss=0.1: Exit if price drops below 90
# - take_profit=0.2: Exit when price reaches 120 (fixed take-profit)
# - trail_stop=0.2: Exit when price drops 20% from peak (trailing stop)
# Scenario: Entry at 100 -> rises to 150 -> drops to 120
# - take_profit=0.2: Exits at 120, does not wait for 150
# - trail_stop=0.2: Exits when price drops 20% from peak 150 = 120
# Fixed take-profit
report_fixed = sim(position, resample='M', stop_loss=0.1, take_profit=0.2)
# Trailing stop
report_trailing = sim(position, resample='M', stop_loss=0.1, trail_stop=0.2)
Q: How do I use pandas methods on FinlabDataFrame?
from finlab import data
close = data.get('price:收盤價')
# FinlabDataFrame is fully compatible with pandas
close.rolling(20).mean() # pandas method
close.fillna(0) # pandas method
close.dropna() # pandas method
# All pandas methods work
# Results remain FinlabDataFrame (retaining extended features)
Resources
- Stock Selection Tutorial - FinlabDataFrame practical applications
- Backtesting Tutorial - Complete backtesting workflow
- Technical Indicator Selection Examples - Strategy examples
- pandas Official Documentation - pandas fundamentals
Args:
n (positive-int): 設定比較前第n筆低。
Returns:
(pd.DataFrame): data
Examples:
收盤價是否低於10日前股價
Args:
n (positive-int): 設定每列前 n 筆大的數值。
Returns:
(pd.DataFrame): data
Examples:
每季 ROA 前 10 名的股票