pyspark.pandas.date_range

pyspark.pandas.date_range(start: Union[str, Any] = None, end: Union[str, Any] = None, periods: Optional[int] = None, freq: Union[str, pandas._libs.tslibs.offsets.DateOffset, None] = None, tz: Union[str, datetime.tzinfo, None] = None, normalize: bool = False, name: Optional[str] = None, closed: Optional[str] = None, **kwargs: Any) → pyspark.pandas.indexes.datetimes.DatetimeIndex

Return a fixed frequency DatetimeIndex.
Parameters
    start : str or datetime-like, optional
        Left bound for generating dates.
    end : str or datetime-like, optional
        Right bound for generating dates.
    periods : int, optional
        Number of periods to generate.
    freq : str or DateOffset, default 'D'
        Frequency strings can have multiples, e.g. '5H'.
    tz : str or tzinfo, optional
        Time zone name for returning localized DatetimeIndex, for example 'Asia/Hong_Kong'. By default, the resulting DatetimeIndex is time zone naive.
    normalize : bool, default False
        Normalize start/end dates to midnight before generating date range.
    name : str, default None
        Name of the resulting DatetimeIndex.
    closed : {None, 'left', 'right'}, optional
        Make the interval closed with respect to the given frequency to the 'left', 'right', or both sides (None, the default).
        Deprecated since version 3.4.0.
    **kwargs
        For compatibility. Has no effect on the result.
Returns
    rng : DatetimeIndex
See also
    DatetimeIndex : An immutable container for datetimes.
Notes
Of the four parameters start, end, periods, and freq, exactly three must be specified. If freq is omitted, the resulting DatetimeIndex will have periods linearly spaced elements between start and end (closed on both sides).

To learn more about the frequency strings, see the offset aliases section of the pandas time series user guide.
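The linear spacing described above can be illustrated with plain standard-library arithmetic. This is only a sketch of the idea, not how pandas-on-Spark computes it internally; `linspace_datetimes` is a hypothetical helper name, and the sketch assumes `periods` is at least 2.

```python
from datetime import datetime
from typing import List


def linspace_datetimes(start: datetime, end: datetime, periods: int) -> List[datetime]:
    """Return `periods` evenly spaced datetimes from start to end, both included.

    Hypothetical helper for illustration only.
    """
    if periods < 2:
        raise ValueError("this sketch needs at least 2 periods")
    # Dividing a timedelta by an int gives the spacing between neighbours.
    step = (end - start) / (periods - 1)
    return [start + i * step for i in range(periods)]


points = linspace_datetimes(datetime(2018, 4, 24), datetime(2018, 4, 27), 3)
# Three days split into two equal gaps of 36 hours each:
# 2018-04-24 00:00, 2018-04-25 12:00, 2018-04-27 00:00
```

Both endpoints are always included, which is what "closed on both sides" means here.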
Examples
Specifying the values
The next four examples vary the combination of start, end and periods. The first two generate the same DatetimeIndex.

Specify start and end, with the default daily frequency.

>>> import pyspark.pandas as ps
>>> ps.date_range(start='1/1/2018', end='1/08/2018')
DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08'],
              dtype='datetime64[ns]', freq=None)

Specify start and periods, the number of periods (days).

>>> ps.date_range(start='1/1/2018', periods=8)
DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
               '2018-01-05', '2018-01-06', '2018-01-07', '2018-01-08'],
              dtype='datetime64[ns]', freq=None)

Specify end and periods, the number of periods (days).

>>> ps.date_range(end='1/1/2018', periods=8)
DatetimeIndex(['2017-12-25', '2017-12-26', '2017-12-27', '2017-12-28',
               '2017-12-29', '2017-12-30', '2017-12-31', '2018-01-01'],
              dtype='datetime64[ns]', freq=None)

Specify start, end, and periods; the frequency is generated automatically (linearly spaced).

>>> ps.date_range(
...     start='2018-04-24', end='2018-04-27', periods=3
... )
DatetimeIndex(['2018-04-24 00:00:00', '2018-04-25 12:00:00',
               '2018-04-27 00:00:00'],
              dtype='datetime64[ns]', freq=None)
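The daily-frequency combinations in this group (start and end, start and periods, end and periods) can be modelled with a small standard-library sketch. With the frequency fixed at one day, exactly two of the remaining three arguments determine the range. `daily_range` is a hypothetical helper written for illustration and is not part of pyspark.pandas.

```python
from datetime import date, timedelta
from typing import List, Optional


def daily_range(start: Optional[date] = None,
                end: Optional[date] = None,
                periods: Optional[int] = None) -> List[date]:
    """Daily dates from exactly two of start, end, periods (illustrative only)."""
    given = sum(arg is not None for arg in (start, end, periods))
    if given != 2:
        raise ValueError("specify exactly two of start, end, periods")
    one_day = timedelta(days=1)
    if start is not None and end is not None:
        # Both bounds known: count the days, including both endpoints.
        periods = (end - start).days + 1
    elif start is None:
        # Only the right bound known: walk back periods - 1 days.
        start = end - (periods - 1) * one_day
    return [start + i * one_day for i in range(periods)]


by_bounds = daily_range(start=date(2018, 1, 1), end=date(2018, 1, 8))
by_start = daily_range(start=date(2018, 1, 1), periods=8)
by_end = daily_range(end=date(2018, 1, 1), periods=8)
```

Here `by_bounds` and `by_start` hold the same eight dates, while `by_end` runs from 2017-12-25 to 2018-01-01.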
Other Parameters
Change the freq (frequency) to 'M' (month end frequency).

>>> ps.date_range(start='1/1/2018', periods=5, freq='M')
DatetimeIndex(['2018-01-31', '2018-02-28', '2018-03-31', '2018-04-30',
               '2018-05-31'],
              dtype='datetime64[ns]', freq=None)

Multiples are allowed.

>>> ps.date_range(start='1/1/2018', periods=5, freq='3M')
DatetimeIndex(['2018-01-31', '2018-04-30', '2018-07-31', '2018-10-31',
               '2019-01-31'],
              dtype='datetime64[ns]', freq=None)

freq can also be specified as an Offset object.

>>> import pandas as pd
>>> ps.date_range(
...     start='1/1/2018', periods=5, freq=pd.offsets.MonthEnd(3)
... )
DatetimeIndex(['2018-01-31', '2018-04-30', '2018-07-31', '2018-10-31',
               '2019-01-31'],
              dtype='datetime64[ns]', freq=None)
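To make the month-end results above easier to verify by hand, the following standard-library sketch reproduces the same dates. It assumes the behaviour visible in the examples: the first value is the last day of the month containing start, and each later value is the month end a fixed number of months on. `month_ends` and `step_months` are hypothetical names, not part of any library API.

```python
import calendar
from datetime import date
from typing import List


def month_ends(start: date, periods: int, step_months: int = 1) -> List[date]:
    """Month-end dates beginning with start's month (illustrative only)."""
    result = []
    year, month = start.year, start.month
    for _ in range(periods):
        # monthrange returns (weekday of the 1st, number of days in the month).
        last_day = calendar.monthrange(year, month)[1]
        result.append(date(year, month, last_day))
        # Advance by step_months, carrying overflow into the year.
        total = (month - 1) + step_months
        year += total // 12
        month = total % 12 + 1
    return result


monthly = month_ends(date(2018, 1, 1), periods=5)
quarterly = month_ends(date(2018, 1, 1), periods=5, step_months=3)
```

`monthly` corresponds to the freq='M' example and `quarterly` to the freq='3M' example.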
closed controls whether to include start and end that are on the boundary. The default includes boundary points on either end.
>>> ps.date_range(
...     start='2017-01-01', end='2017-01-04', closed=None
... )
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04'],
              dtype='datetime64[ns]', freq=None)

Use closed='left' to exclude end if it falls on the boundary.

>>> ps.date_range(
...     start='2017-01-01', end='2017-01-04', closed='left'
... )
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03'],
              dtype='datetime64[ns]', freq=None)

Use closed='right' to exclude start if it falls on the boundary.

>>> ps.date_range(
...     start='2017-01-01', end='2017-01-04', closed='right'
... )
DatetimeIndex(['2017-01-02', '2017-01-03', '2017-01-04'],
              dtype='datetime64[ns]', freq=None)
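The boundary handling shown in the closed examples can be summarised in a few lines of standard-library Python. This is a sketch of the semantics for a daily frequency only; `daily_range_closed` is a hypothetical helper, not the library's implementation.

```python
from datetime import date, timedelta
from typing import List, Optional


def daily_range_closed(start: date, end: date,
                       closed: Optional[str] = None) -> List[date]:
    """Daily dates from start to end with optional boundary exclusion."""
    if closed not in (None, "left", "right"):
        raise ValueError("closed must be None, 'left' or 'right'")
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    if closed == "left" and days and days[-1] == end:
        # Closed on the left only: drop the right boundary.
        days = days[:-1]
    elif closed == "right" and days and days[0] == start:
        # Closed on the right only: drop the left boundary.
        days = days[1:]
    return days


both = daily_range_closed(date(2017, 1, 1), date(2017, 1, 4))
left = daily_range_closed(date(2017, 1, 1), date(2017, 1, 4), closed="left")
right = daily_range_closed(date(2017, 1, 1), date(2017, 1, 4), closed="right")
```

`both` keeps all four dates, `left` ends on 2017-01-03, and `right` begins on 2017-01-02.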