Compare commits

No commits in common. "master" and "find_closest_changes" have entirely different histories.

master...find_closest_changes

README.md
@@ -1,28 +1,22 @@
 # PyFacts

 PyFacts stands for Python library for Financial analysis and computations on time series. It is a library that makes it simple to work with time series data.

 Most libraries, and languages like SQL, work with rows. Operations are performed by rows and not by dates. For instance, to calculate 1-year rolling returns in SQL, you are forced to use either a lag of 365/252 rows, leading to an approximation, or slow and cumbersome joins. PyFacts solves this by allowing you to work with dates and time intervals. Hence, to calculate 1-year returns, you specify a lag of 1 year and the library does the grunt work of finding the most appropriate observations to calculate those returns on.

 ## The problem

 Libraries and languages usually don't allow comparison based on dates. Calculating month-on-month or year-on-year returns is always cumbersome, as users are forced to rely on row lags. However, data always has inconsistencies, especially financial data: markets don't work on weekends, there are off days, data doesn't get released on a few days a year, and availability is patchy when dealing with 40-year-old data. All these problems are exacerbated when you are forced to make calculations using lag.

 ## The Solution

 PyFacts aims to simplify things by allowing you to:

-* Compare time-series data based on dates and time-period-based lag
-* Easy way to work around missing dates by taking the closest data points
-* Completing series with missing data points using forward fill and backward fill
-* Use friendly dates everywhere written as a simple string
+- Compare time-series data based on dates and time-period-based lag
+- Easy way to work around missing dates by taking the closest data points
+- Completing series with missing data points using forward fill and backward fill
+- Use friendly dates everywhere written as a simple string

 ## Creating a time series

 Time series data can be created from a dictionary, a list of lists/tuples/dicts, or by reading a CSV file.

 Example:

 ```
 >>> import pyfacts as pft
@@ -39,7 +33,6 @@ Example:
 ```

 ### Sample usage

 ```
 >>> ts.calculate_returns(as_on='2021-04-01', return_period_unit='months', return_period_value=3, annual_compounded_returns=False)
 (datetime.datetime(2021, 4, 1, 0, 0), 0.6)
@@ -49,24 +42,21 @@ Example:
 ```

 ### Working with dates

 With PyFacts, you never have to go through the hassle of creating datetime objects for your time series. PyFacts will parse any date passed to it as a string. The default format is ISO format, i.e., YYYY-MM-DD. However, you can use your preferred format simply by specifying it in the options, in a datetime-library-compatible format, after importing the library. For example, to use the DD-MM-YYYY format:

 ```
 >>> import pyfacts as pft
 >>> pft.PyfactsOptions.date_format = '%d-%m-%Y'
 ```

 Now the library will automatically parse all dates as DD-MM-YYYY.

 If you ever need to use a different format for a single occasion, all methods accept a date_format parameter to override the default.
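For instance, here is a minimal sketch of a one-off override, assuming `ts` is an existing `TimeSeries` (the `calculate_returns` parameters are those shown in the sample usage above):

```
>>> import pyfacts as pft
>>> pft.PyfactsOptions.date_format = '%d-%m-%Y'  # session default is now DD-MM-YYYY
>>> # this one call parses its date as ISO instead, via the per-call override
>>> ts.calculate_returns(as_on='2021-04-01', return_period_unit='months',
...                      return_period_value=3, date_format='%Y-%m-%d')
```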
 ### Working with multiple time series

 While working with time series data, you will often need to perform calculations on the data. PyFacts supports all kinds of mathematical operations on time series.

 Example:

 ```
 >>> import pyfacts as pft
@@ -93,7 +83,6 @@ TimeSeries([(datetime.datetime(2022, 1, 1, 0, 0), 0.1),
 Mathematical operations can also be done between time series as long as they have the same dates.

 Example:

 ```
 >>> import pyfacts as pft
@@ -121,7 +110,6 @@ TimeSeries([(datetime.datetime(2022, 1, 1, 0, 0), 1.0),
 However, if the dates are not in sync, PyFacts provides convenience methods for synchronising dates.

 Example:

 ```
 >>> import pyfacts as pft
@@ -158,7 +146,6 @@ TimeSeries([(datetime.datetime(2022, 1, 1, 0, 0), 20.0),
 Even if you need to perform calculations on data with different frequencies, PyFacts will let you easily handle this with the expand and shrink methods.

 Example:

 ```
 >>> data = [
 ...     ("2022-01-01", 10),
@@ -189,7 +176,6 @@ TimeSeries([(datetime.datetime(2022, 1, 1, 0, 0), 10.0),
 If you want to shorten the timeframe of the data with an aggregation function, the transform method will help you out. Currently it supports sum and mean.

 Example:

 ```
 >>> data = [
 ...     ("2022-01-01", 10),
@@ -222,11 +208,11 @@ TimeSeries([(datetime.datetime(2022, 1, 1, 0, 0), 12.0),
 (datetime.datetime(2022, 10, 1, 0, 0), 30.0)], frequency='Q')
 ```

 ## To-do

 ### Core features

 - [x] Add __setitem__
 - [ ] Create empty TimeSeries object
 - [x] Read from CSV
 - [ ] Write to CSV
@@ -234,20 +220,18 @@ TimeSeries([(datetime.datetime(2022, 1, 1, 0, 0), 12.0),
 - [x] Convert to list of tuples

 ### pyfacts features

 - [x] Sync two TimeSeries
 - [x] Average rolling return
 - [x] Sharpe ratio
 - [x] Jensen's Alpha
 - [x] Beta
-- [x] Sortino ratio
+- [ ] Sortino ratio
 - [x] Correlation & R-squared
 - [ ] Treynor ratio
 - [x] Max drawdown
 - [ ] Moving average

 ### Pending implementation

 - [x] Use limit parameter in ffill and bfill
 - [x] Implementation of ffill and bfill may be incorrect inside expand, check and correct
 - [ ] Implement interpolation in expand
@@ -2,26 +2,3 @@ from .core import *
 from .pyfacts import *
 from .statistics import *
 from .utils import *
-
-__author__ = "Gourav Kumar"
-__email__ = "gouravkr@outlook.in"
-__version__ = "0.0.1"
-
-
-__doc__ = """
-PyFacts stands for Python library for Financial analysis and computations on time series.
-It is a library which makes it simple to work with time series data.
-
-Most libraries, and languages like SQL, work with rows. Operations are performed by rows
-and not by dates. For instance, to calculate 1-year rolling returns in SQL, you are forced
-to use either a lag of 365/252 rows, leading to an approximation, or slow and cumbersome
-joins. PyFacts solves this by allowing you to work with dates and time intervals. Hence,
-to calculate 1-year returns, you will be specifying a lag of 1-year and the library will
-do the grunt work of finding the most appropriate observations to calculate these returns on.
-
-PyFacts aims to simplify things by allowing you to:
-* Compare time-series data based on dates and time-period-based lag
-* Easy way to work around missing dates by taking the closest data points
-* Completing series with missing data points using forward fill and backward fill
-* Use friendly dates everywhere written as a simple string
-"""
@@ -76,28 +76,29 @@ def create_date_series(
     if eomonth and frequency.days < AllFrequencies.M.days:
         raise ValueError(f"eomonth cannot be set to True if frequency is higher than {AllFrequencies.M.name}")

+    if ensure_coverage:
+        if frequency.days == 1 and skip_weekends and end_date.weekday() > 4:
+            extend_by_days = 7 - end_date.weekday()
+            end_date += relativedelta(days=extend_by_days)
+
+        # TODO: Add code to ensure coverage for other frequencies as well
+
+    datediff = (end_date - start_date).days / frequency.days + 1
     dates = []
-    counter = 0
-    while counter < 100000:
-        diff = {frequency.freq_type: frequency.value * counter}
+
+    for i in range(0, int(datediff)):
+        diff = {frequency.freq_type: frequency.value * i}
         date = start_date + relativedelta(**diff)

         if eomonth:
-            date += relativedelta(months=1, day=1, days=-1)
+            replacement = {"month": date.month + 1} if date.month < 12 else {"year": date.year + 1}
+            date = date.replace(day=1).replace(**replacement) - relativedelta(days=1)

-        if date > end_date:
-            if not ensure_coverage:
-                break
-            elif dates[-1] >= end_date:
-                break
-
-        counter += 1
-        if frequency.days > 1 or not skip_weekends:
-            dates.append(date)
-        elif date.weekday() < 5:
-            dates.append(date)
-    else:
-        raise ValueError("Cannot generate a series containing more than 100000 dates")
+        if date <= end_date:
+            if frequency.days > 1 or not skip_weekends:
+                dates.append(date)
+            elif date.weekday() < 5:
+                dates.append(date)

     return Series(dates, dtype="date")
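For context on the rewritten loop above, a small sketch of how `create_date_series` is called (mirroring `test_monthly` later in this diff):

```
>>> import datetime
>>> from pyfacts import create_date_series
>>> start = datetime.datetime(2020, 1, 1)
>>> end = datetime.datetime(2020, 12, 31)
>>> len(create_date_series(start, end, frequency="M"))  # one date per month
12
>>> create_date_series(start, end, frequency="M", eomonth=True)  # month-end dates instead
```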
@@ -567,7 +568,6 @@ class TimeSeries(TimeSeriesCore):
         Parameters
         ----------
         kwargs: parameters to be passed to the calculate_rolling_returns() function
-            Refer TimeSeries.calculate_rolling_returns() method for more details

         Returns
         -------
@@ -805,12 +805,7 @@ class TimeSeries(TimeSeriesCore):
         return statistics.mean(self.values)

     def transform(
-        self,
-        to_frequency: Literal["W", "M", "Q", "H", "Y"],
-        method: Literal["sum", "mean"],
-        eomonth: bool = False,
-        ensure_coverage: bool = True,
-        anchor_date=Literal["start", "end"],
+        self, to_frequency: Literal["W", "M", "Q", "H", "Y"], method: Literal["sum", "mean"], eomonth: bool = False
     ) -> TimeSeries:
         """Transform a time series object into a lower frequency object with an aggregation function.

@@ -850,33 +845,28 @@ class TimeSeries(TimeSeriesCore):

         dates = create_date_series(
             self.start_date,
-            self.end_date,  # + relativedelta(days=to_frequency.days),
+            self.end_date
+            + datetime.timedelta(to_frequency.days),  # need extra date at the end for calculation of last value
             to_frequency.symbol,
-            ensure_coverage=ensure_coverage,
-            eomonth=eomonth,
+            ensure_coverage=True,
         )
-        # prev_date = dates[0]
+        prev_date = dates[0]

         new_ts_dict = {}
-        for idx, date in enumerate(dates):
-            if idx == 0:
-                cur_data = self[self.dates <= date]
-            else:
-                cur_data = self[(self.dates <= date) & (self.dates > dates[idx - 1])]
+        for date in dates[1:]:
+            cur_data = self[(self.dates >= prev_date) & (self.dates < date)]
             if method == "sum":
                 value = sum(cur_data.values)
             elif method == "mean":
                 value = cur_data.mean()

-            new_ts_dict.update({date: value})
-            # prev_date = date
+            new_ts_dict.update({prev_date: value})
+            prev_date = date

         return self.__class__(new_ts_dict, to_frequency.symbol)
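As a usage sketch of `transform` after this change (frequencies and methods mirror the tests later in this diff; `ts` is assumed to be a daily `TimeSeries`):

```
>>> tst = ts.transform("M", "mean")  # aggregate daily observations into monthly means
>>> tst = ts.transform("Y", "mean")  # or into yearly means
```

Transforming to a higher frequency is rejected; for example, the tests expect `ts.transform("D", "mean")` on a weekly series to raise a ValueError.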
-def _preprocess_csv(
-    file_path: str | pathlib.Path, delimiter: str = ",", encoding: str = "utf-8", **kwargs
-) -> List[list]:
+def _preprocess_csv(file_path: str | pathlib.Path, delimiter: str = ",", encoding: str = "utf-8") -> List[list]:
     """Preprocess csv data"""

     if isinstance(file_path, str):
@@ -886,7 +876,7 @@ def _preprocess_csv(
         raise ValueError("File not found. Check the file path")

     with open(file_path, "r", encoding=encoding) as file:
-        reader: csv.reader = csv.reader(file, delimiter=delimiter, **kwargs)
+        reader: csv.reader = csv.reader(file, delimiter=delimiter)
         csv_data: list = list(reader)

     csv_data = [i for i in csv_data if i]  # remove blank rows
@@ -907,51 +897,8 @@ def read_csv(
     nrows: int = -1,
     delimiter: str = ",",
     encoding: str = "utf-8",
-    **kwargs,
 ) -> TimeSeries:
-    """Reads Time Series data directly from a CSV file
-
-    Parameters
-    ----------
-    csv_file_pah:
-        path of the csv file to be read.
-
-    frequency:
-        frequency of the time series data.
-
-    date_format:
-        date format, specified as datetime compatible string
-
-    col_names:
-        specify the column headers to be read.
-        this parameter will allow you to read two columns from a CSV file which may have more columns.
-        this parameter overrides col_index parameter.
-
-    dol_index:
-        specify the column numbers to be read.
-        this parameter will allow you to read two columns from a CSV file which may have more columns.
-        if neither names nor index is specified, the first two columns from the csv file will be read,
-        with the first being treated as date.
-
-    has_header:
-        specify whether the file has a header row.
-        if true, the header row will be ignored while creating the time series data.
-
-    skip_rows:
-        the number of rows after the header which should be skipped.
-
-    nrows:
-        the number of rows to be read from the csv file.
-
-    delimiter:
-        specify the delimeter used in the csv file.
-
-    encoding:
-        specify the encoding of the csv file.
-
-    kwargs:
-        other keyword arguments to be passed on the csv.reader()
-    """
+    """Reads Time Series data directly from a CSV file"""

     data = _preprocess_csv(csv_file_path, delimiter, encoding)
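A usage sketch for `read_csv` (the file name is hypothetical, and the `frequency` and `date_format` parameter names are taken from the docstring being removed above, so treat them as indicative rather than the final API):

```
>>> import pyfacts as pft
>>> ts = pft.read_csv("prices.csv", frequency="D", date_format="%Y-%m-%d")
```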
@@ -7,7 +7,7 @@ from typing import Literal

 from pyfacts.core import date_parser

-from .pyfacts import TimeSeries, create_date_series
+from .pyfacts import TimeSeries
 from .utils import _interval_to_years, _preprocess_from_to_date, covariance

 # from dateutil.relativedelta import relativedelta
@@ -587,35 +587,3 @@ def sortino_ratio(

     sortino_ratio_value = excess_returns / sd
     return sortino_ratio_value
-
-
-@date_parser(3, 4)
-def moving_average(
-    time_series_data: TimeSeries,
-    moving_average_period_unit: Literal["years", "months", "days"],
-    moving_average_period_value: int,
-    from_date: str | datetime.datetime = None,
-    to_date: str | datetime.datetime = None,
-    as_on_match: str = "closest",
-    prior_match: str = "closest",
-    closest: Literal["previous", "next"] = "previous",
-    date_format: str = None,
-) -> TimeSeries:
-
-    from_date, to_date = _preprocess_from_to_date(
-        from_date,
-        to_date,
-        time_series_data,
-        False,
-        return_period_unit=moving_average_period_unit,
-        return_period_value=moving_average_period_value,
-        as_on_match=as_on_match,
-        prior_match=prior_match,
-        closest=closest,
-    )
-
-    dates = create_date_series(from_date, to_date, time_series_data.frequency.symbol)
-
-    for date in dates:
-        start_date = date - datetime.timedelta(**{moving_average_period_unit: moving_average_period_value})
-        time_series_data[start_date:date]
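The deleted draft above stops short: it slices a window for each date but never aggregates or collects the results, and `datetime.timedelta` does not accept `years`/`months` keywords. A minimal sketch of the intended computation, assuming the date-mask slicing and `mean()` used elsewhere in this diff; the function name is illustrative, not the library's API:

```
from dateutil.relativedelta import relativedelta

def moving_average_sketch(ts, period_unit: str, period_value: int) -> dict:
    """For each observation date, average the values in the trailing window."""
    result = {}
    for date in ts.dates:
        # relativedelta, unlike timedelta, accepts years/months/days keywords
        window_start = date - relativedelta(**{period_unit: period_value})
        window = ts[(ts.dates > window_start) & (ts.dates <= date)]  # trailing window
        result[date] = window.mean()  # mean() as used by TimeSeries.transform
    return result
```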
@@ -1,7 +1,6 @@
 attrs==21.4.0
 black==22.1.0
 click==8.1.3
-python-dateutil==2.8.2
 flake8==4.0.1
 iniconfig==1.1.1
 isort==5.10.1
setup.py
@@ -2,10 +2,9 @@ from setuptools import find_packages, setup

 license = open("LICENSE").read().strip()


 setup(
     name="pyfacts",
-    version="0.0.1",
+    version=open("VERSION").read().strip(),
     license=license,
     author="Gourav Kumar",
     author_email="gouravkr@outlook.in",
@@ -3,11 +3,10 @@ import math
 import random
 from typing import List

+import pyfacts as pft
 import pytest
 from dateutil.relativedelta import relativedelta

-import pyfacts as pft
-

 def conf_add(n1, n2):
     return n1 + n2

@@ -96,9 +95,7 @@ def sample_data_generator(
         )
     }
     end_date = start_date + relativedelta(**timedelta_dict)
-    dates = pft.create_date_series(
-        start_date, end_date, frequency.symbol, skip_weekends=skip_weekends, eomonth=eomonth, ensure_coverage=False
-    )
+    dates = pft.create_date_series(start_date, end_date, frequency.symbol, skip_weekends=skip_weekends, eomonth=eomonth)
     if dates_as_string:
         dates = [dt.strftime("%Y-%m-%d") for dt in dates]
     values = create_prices(1000, mu, sigma, num)
@@ -1,7 +1,6 @@
 import datetime

 import pytest

 from pyfacts import (
     AllFrequencies,
     Frequency,
@@ -30,7 +29,7 @@ class TestDateSeries:
     def test_monthly(self):
         start_date = datetime.datetime(2020, 1, 1)
         end_date = datetime.datetime(2020, 12, 31)
-        d = create_date_series(start_date, end_date, frequency="M", ensure_coverage=False)
+        d = create_date_series(start_date, end_date, frequency="M")
         assert len(d) == 12

         d = create_date_series(start_date, end_date, frequency="M", eomonth=True)
@@ -327,7 +326,7 @@ class TestExpand:
         ts_data = create_test_data(AllFrequencies.M, num=6)
         ts = TimeSeries(ts_data, "M")
         expanded_ts = ts.expand("W", "ffill")
-        assert len(expanded_ts) == 23
+        assert len(expanded_ts) == 22
         assert expanded_ts.frequency.name == "weekly"
         assert expanded_ts.iloc[0][1] == expanded_ts.iloc[1][1]
@@ -341,23 +340,8 @@ class TestExpand:


 class TestShrink:
-    def test_daily_to_smaller(self, create_test_data):
-        ts_data = create_test_data(AllFrequencies.D, num=1000)
-        ts = TimeSeries(ts_data, "D")
-        shrunk_ts_w = ts.shrink("W", "ffill")
-        shrunk_ts_m = ts.shrink("M", "ffill")
-        assert len(shrunk_ts_w) == 144
-        assert len(shrunk_ts_m) == 34
-
-    def test_weekly_to_smaller(self, create_test_data):
-        ts_data = create_test_data(AllFrequencies.W, num=300)
-        ts = TimeSeries(ts_data, "W")
-        tsm = ts.shrink("M", "ffill")
-        assert len(tsm) == 70
-        tsmeo = ts.shrink("M", "ffill", eomonth=True)
-        assert len(tsmeo) == 69
-        with pytest.raises(ValueError):
-            ts.shrink("D", "ffill")
+    # TODO
+    pass


 class TestMeanReturns:
@@ -374,29 +358,29 @@ class TestTransform:
     def test_daily_to_weekly(self, create_test_data):
         ts_data = create_test_data(AllFrequencies.D, num=782, skip_weekends=True)
         ts = TimeSeries(ts_data, "D")
-        tst = ts.transform("W", "mean", ensure_coverage=False)
+        tst = ts.transform("W", "mean")
         assert isinstance(tst, TimeSeries)
         assert len(tst) == 157
         assert "2017-01-30" in tst
-        assert tst.iloc[4] == (datetime.datetime(2017, 1, 30), 1020.082)
+        assert tst.iloc[4] == (datetime.datetime(2017, 1, 30), 1021.19)

     def test_daily_to_monthly(self, create_test_data):
         ts_data = create_test_data(AllFrequencies.D, num=782, skip_weekends=False)
         ts = TimeSeries(ts_data, "D")
         tst = ts.transform("M", "mean")
         assert isinstance(tst, TimeSeries)
-        assert len(tst) == 27
+        assert len(tst) == 26
         assert "2018-01-01" in tst
-        assert round(tst.iloc[12][1], 2) == 1146.91
+        assert round(tst.iloc[12][1], 2) == 1146.1

     def test_daily_to_yearly(self, create_test_data):
         ts_data = create_test_data(AllFrequencies.D, num=782, skip_weekends=True)
         ts = TimeSeries(ts_data, "D")
         tst = ts.transform("Y", "mean")
         assert isinstance(tst, TimeSeries)
-        assert len(tst) == 4
+        assert len(tst) == 3
         assert "2019-01-02" in tst
-        assert tst.iloc[2] == (datetime.datetime(2019, 1, 2), 1157.2835632183908)
+        assert tst.iloc[2] == (datetime.datetime(2019, 1, 2), 1238.5195)

     def test_weekly_to_monthly(self, create_test_data):
         ts_data = create_test_data(AllFrequencies.W, num=261)
@@ -404,22 +388,22 @@ class TestTransform:
         tst = ts.transform("M", "mean")
         assert isinstance(tst, TimeSeries)
         assert "2017-01-01" in tst
-        assert tst.iloc[1] == (datetime.datetime(2017, 2, 1), 1008.405)
+        assert tst.iloc[0] == (datetime.datetime(2017, 1, 1), 1007.33)

     def test_weekly_to_qty(self, create_test_data):
         ts_data = create_test_data(AllFrequencies.W, num=261)
         ts = TimeSeries(ts_data, "W")
         tst = ts.transform("Q", "mean")
-        assert len(tst) == 21
+        assert len(tst) == 20
         assert "2018-01-01" in tst
-        assert round(tst.iloc[4][1], 2) == 1032.01
+        assert round(tst.iloc[4][1], 2) == 1054.72

     def test_weekly_to_yearly(self, create_test_data):
         ts_data = create_test_data(AllFrequencies.W, num=261)
         ts = TimeSeries(ts_data, "W")
         tst = ts.transform("Y", "mean")
         assert "2019-01-01" in tst
-        assert round(tst.iloc[2][1], 2) == 1053.70
+        assert round(tst.iloc[2][1], 2) == 1054.50
         with pytest.raises(ValueError):
             ts.transform("D", "mean")

@@ -427,9 +411,9 @@ class TestTransform:
         ts_data = create_test_data(AllFrequencies.M, num=36)
         ts = TimeSeries(ts_data, "M")
         tst = ts.transform("Q", "mean")
-        assert len(tst) == 13
+        assert len(tst) == 12
         assert "2018-10-01" in tst
-        assert tst.iloc[7] == (datetime.datetime(2018, 10, 1), 1022.6466666666666)
+        assert tst.iloc[7] == (datetime.datetime(2018, 10, 1), 1021.19)
         with pytest.raises(ValueError):
             ts.transform("M", "sum")
tox.ini
@@ -1,11 +1,10 @@
 [tox]
-minversion = 3.8.10
-envlist = py38,py39,py310,py311,py312,py313
+minversion = 3.8.0
+envlist = py38,py39,py310

 [testenv]
 deps = pytest
-    python-dateutil
-commands = pytest tests
+commands = pytest

 [flake8]
 max-line-length=125