`pya`

Collection of classes and functions for processing audio signals in Python and Jupyter notebooks, for synthesis, effects, analysis and plotting.

Package Contents

Classes

`Asig`
`Astft`	Audio spectrogram (STFT) class; attributes refer to scipy.signal.stft.
`Aspec`
`Amfcc`	Mel filtered Fourier spectrum (MFCC) class.
`Aserver`
`Arecorder`	pya audio recorder
`Ugen`	Unit generator to create an Asig with a predefined signal
`DummyBackend`	Helper class that provides a standard way to create an ABC using inheritance.
`PyAudioBackend`	Helper class that provides a standard way to create an ABC using inheritance.

Functions

`basicplot`(data, ticks, channels[, offset, scale, cn, ...])	Basic version of the plot for pya, this can be directly used
`gridplot`(pya_objects[, colwrap, cbar_ratio, figsize])	Create a grid plot of pya objects which have plot() methods,
`audio_read`(fp)
`linlin`(x, smi, sma, dmi, dma)	Linear mapping
`midicps`(m)	Convert a MIDI note number to cycles per second (Hz)
`cpsmidi`(c)	Convert cycles per second (Hz) to a MIDI note number
`dbamp`(db)	Convert dB to amplitude
`ampdb`(amp)	Convert amplitude to dB
`spectrum`(sig, samples, channels, sr)	Return the spectrum of a given signal. Returns a spectrum matrix if the input signal is multichannel.
`normalize`(d)	Return the normalized input array
`audio_from_file`(path[, dtype])	Load an audio buffer using audioread.
`buf_to_float`(x[, n_bytes, dtype])	Convert an integer buffer to floating point values.
`device_info`()	Return a formatted string about available audio devices and their info
`find_device`([min_input, min_output])
`padding`(x, width[, tail, constant_values])	Pad signal with certain width, support 1-3D tensors.
`is_pow2`(val)	Check whether the input is a power of 2; returns a bool.
`next_pow2`(x)	Find the closest power of 2 that is greater than or equal to x.
`round_half_up`(number)	Round up if >= .5
`rolling_window`(a, window[, step])
`signal_to_frame`(sig, n_per_frame, frame_step[, ...])	Frame a signal into overlapping frames.
`magspec`(frames, NFFT)	Compute the magnitude spectrum of each frame in frames.
`powspec`(frames, NFFT)	Compute the power spectrum of each frame in frames,
`hz2mel`(hz)	Convert a value in Hertz to Mels
`mel2hz`(mel)	Convert a value in Mels to Hertz
`determine_backend`([force_webaudio, port])
`startup`(**kwargs)
`shutdown`(**kwargs)
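
The conversion helpers listed above (linlin, midicps, cpsmidi, dbamp, ampdb) correspond to well-known formulas. The following is a minimal standalone sketch of those formulas, assuming equal temperament with A4 = 440 Hz = MIDI note 69 and amplitude dB (20 * log10). The function names are hypothetical and chosen to avoid implying that this is how pya implements them.

```python
import math

A4_HZ = 440.0   # assumed tuning reference
A4_MIDI = 69    # MIDI note number of A4


def midi_to_hz(m):
    """MIDI note number -> frequency in Hz (equal temperament)."""
    return A4_HZ * 2.0 ** ((m - A4_MIDI) / 12.0)


def hz_to_midi(c):
    """Frequency in Hz -> (fractional) MIDI note number."""
    return A4_MIDI + 12.0 * math.log2(c / A4_HZ)


def db_to_amp(db):
    """Decibel value -> linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def amp_to_db(amp):
    """Linear amplitude factor (> 0) -> decibel value."""
    return 20.0 * math.log10(amp)


def lin_map(x, smi, sma, dmi, dma):
    """Map x linearly from the source range [smi, sma] to [dmi, dma]."""
    return (x - smi) / (sma - smi) * (dma - dmi) + dmi
```

With these definitions, one octave up doubles the frequency (12 MIDI steps), and -20 dB corresponds to an amplitude factor of about 0.1.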

class pya.Asig(sig, sr=44100, label='', channels=1, cn=None)

Audio signal class. Asig enables manipulation of audio signals in the style of numpy and more. Asig offers functions for plotting (via matplotlib) and playing audio (using the pya.Aserver class).

sig

Array for the audio signal. Can be mono or multichannel.

Type:: numpy.array

sr

Sampling rate

Type:: int

label

A string label to give the object a unique identifier.

Type:: str

channels

Number of channels

Type:: int

cn

cn, short for channel names, is a list of strings of size channels that gives each channel a unique name. Channel names can be used to subset signal channels in a more readable way, e.g. asig[:, ['left', 'front']] subsets the left and front channels of the signal.

Type:: list of str, None

mix_mode

Used to extend the numpy __setitem__() operation to frequent audio manipulations such as mixing, extending, bounding and replacing. Asig currently supports the mix_modes bound, extend and overwrite. mix_mode should not be set directly; it is set temporarily when using the .bound, .extend and .overwrite properties.

Type:: str or None

property channels: Return the number of channels

property samples: Return the length of signal in samples

property cn: Channel names getter

property x: Extend mode. This mode allows the destination sig size in an assignment to be extended through setitem.

property b: Bound mode. This mode truncates a source signal in an assignment to fit a limited destination in setitem.

property o: Overwrite mode. This mode cuts the target selection and replaces it with the source signal on assignment via setitem.

extend

bound

overwrite

_load_audio_file(fname)

Load audio file, and set self.sig to the signal and self.sr to the sampling rate. Two types of audio loader are currently supported: 1) the standard library for .wav and .aiff, and 2) ffmpeg for other formats such as .mp3.

Parameters:: fname (str) – Path to file.

save_wavfile(fname='asig.wav', dtype='float32')

Save signal as .wav file, return self.

Parameters:

fname (str) – name of the file with .wav (Default value = “asig.wav”)
dtype (str) – datatype (Default value = ‘float32’)

_set_col_names()

__getitem__(index)

Accessing array elements through slicing.

int, get signal row asig[4];
slice, range and step slicing asig[4:40:2] # from 4 to 40 every 2 samples;
list, subset rows, asig[[2, 4, 6]] # pick out index 2, 4, 6 as a new asig
tuple, row and column specific slicing, asig[4:40, 3:5] # from 4 to 40, channel 3 and 4
Time slicing (unit in seconds) using dict asig[{1:2.5}, :] creates indexing of 1s to 2.5s.
Channel name slicing: asig[‘l’] returns channel ‘l’ as a new mono asig. asig[[‘front’, ‘rear’]], etc…
bool, subset channels: asig[:, [True, False]]

Parameters:: index (Number or slice or list or tuple or dict) – Slicing argument.
Returns:: a – __getitem__ returns a subset of the self based on the slicing.
Return type:: Asig
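
To illustrate the time-slicing form described above, here is a small sketch of how a dict such as {1: 2.5} could be translated into a sample-index slice. The helper name is hypothetical, and truncating to an integer sample index is an assumption; pya may round differently.

```python
def time_dict_to_slice(time_dict, sr):
    """Translate {start_seconds: stop_seconds} into a slice of sample indices.

    Assumption: seconds are converted by multiplying with the sampling rate
    and truncating towards zero.
    """
    start_s, stop_s = next(iter(time_dict.items()))
    return slice(int(start_s * sr), int(stop_s * sr))


# Example: 1 s to 2.5 s at 44.1 kHz
sample_slice = time_dict_to_slice({1: 2.5}, sr=44100)
```

The resulting slice can be applied to any per-sample sequence, which is the same row selection that asig[{1: 2.5}, :] expresses in seconds.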

__setitem__(index, value)

setitem: asig[index] = value. This allows all the methods from getitem:

numpy style slicing
string/string_list slicing for subsetting channels based on channel name self.cn
time slicing (unit seconds) via dict.
bool slicing to filter out specific channels.

In addition, there are 4 possible modes (referring to asig as ‘dest’ and value as ‘src’):

standard Pythonic way, where the src and dest dimensions need to match
asig[…] = value
bound mode, where src is copied up to the bounds of dest
asig.b[…] = value
extend mode, where dest is dynamically extended to make space for src
asig.x[…] = value
overwrite mode, where the selected dest subset is replaced by the specified src regardless of length
asig.o[…] = value

row index:

list: e.g. [1,2,3,4,5,6,7,8] or [True, …, False] (modes b and x possible)
int: e.g. 0 (i.e. a single sample, so no need for extra modes)
slice: e.g. 100:5000:2 (can be used with all modes)
dict: e.g. {0.5: 2.5} (modes o, b possible, x only if step==1, or if step==None and stop=None)

Parameters:

index (Number or slice or list or tuple or dict) – Slicing argument.
value (Asig or numpy.ndarray or list) – value to set

Returns:

_ – Updated asig

Return type:

Asig
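
The differences between the assignment modes are easiest to see on a plain list. The sketch below mimics the described semantics on a mono signal stored as a Python list. The function assign and its arguments are hypothetical, zero-filling in extend mode is an assumption, and stop is ignored in bound and extend mode for simplicity.

```python
def assign(dest, start, stop, src, mode=None):
    """Return a copy of dest with src assigned at dest[start:stop].

    mode=None        src length must match the selection exactly
    mode="bound"     src is truncated at the end of dest
    mode="extend"    dest grows (zero-filled, an assumption) to fit src
    mode="overwrite" the selection is cut out and replaced by src
    """
    dest = list(dest)
    src = list(src)
    if mode == "overwrite":
        dest[start:stop] = src  # list slice assignment resizes as needed
        return dest
    if mode == "bound":
        src = src[: max(0, len(dest) - start)]
    elif mode == "extend":
        dest.extend([0.0] * max(0, start + len(src) - len(dest)))
    elif len(src) != stop - start:
        raise ValueError("src and dest selection must have the same length")
    dest[start:start + len(src)] = src
    return dest


silence = [0, 0, 0, 0]
bounded = assign(silence, 2, 4, [1, 1, 1], mode="bound")       # src cut to fit
extended = assign(silence, 2, 4, [1, 1, 1], mode="extend")     # dest grows
replaced = assign(silence, 1, 3, [9], mode="overwrite")        # dest shrinks
```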

resample(target_sr=44100, rate=1, kind='linear')

Resample signal based on interpolation, can process multichannel signals.

Parameters:

target_sr (int) – Target sampling rate (Default value = 44100)
rate (float) – Rate to speed up or slow down the audio (Default value = 1)
kind (str) – Type of interpolation (Default value = ‘linear’)

Returns:

_ – Asig with resampled signal.

Return type:

Asig

play(rate=1, **kwargs)

Play Asig audio via Aserver, using Aserver.default (if existing). kwargs are propagated to Aserver.play(onset=0, out=0).

Parameters:

rate (float) – Playback rate (Default value = 1)
**kwargs – ‘server’ (Aserver): set which server to play on, e.g. s = Aserver(); s.boot(); asig.play(server=s)

Returns:

_ – return self

Return type:

Asig

shift_channel(shift=0)

Shift signal to other channels. This is particularly useful for assigning a mono signal to a specific channel.

shift = 0: does nothing as the same signal is being routed to the same position
shift > 0: shift channels of self.sig ‘right’, i.e. from [0,..channels-1] to channels [shift,shift+1,…]
shift < 0: shift channels of self.sig ‘left’, i.e. the first shift channels will be discarded.

Parameters:: shift (int) – shift channel amount (Default value = 0)
Returns:: _ – Rerouted asig
Return type:: Asig

mono(blend=None)

Mix channels to mono signal. Perform sig = np.sum(self.sig_copy * blend, axis=1)

Parameters:: blend (list) – list of gains for each channel, used as multipliers. If the signal is already mono, nothing is done and a warning is raised (Default value = None)
Returns:: _ – A mono Asig object
Return type:: Asig

stereo(blend=None)

Blend all channels of the signal to stereo. Applicable for any single-/ or multi-channel Asig.

Parameters:: blend (list or None) – Usage:
For a mono signal, blend=(g1, g2): the mono channel is broadcast to left and right with gains g1, g2.
For a stereo signal, blend=(g1, g2): each channel is gain-adjusted by g1, g2.
For a multichannel signal, blend=[[list of gains for left channel], [list of gains for right channel]].
Default value = None, resulting in equal distribution to the left and right channels.

Example

asig[:, ['c1', 'c2', 'c3']].stereo([[1, 0.707, 0], [0, 0.707, 1]]): mixes channel 'c1' to the left, 'c2' to the center and 'c3' to the right channel of a new stereo asig. Note that for equal loudness left**2+right**2=1 should be used.

Returns:: _ – A stereo Asig object
Return type:: Asig
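
The multichannel blend can be read as a weighted sum per output channel. The sketch below applies the gains from the example above to three unit impulses, one per input channel. It works on plain lists of frames rather than on an Asig, and blend_to_stereo is a hypothetical helper, not part of pya.

```python
def blend_to_stereo(frames, blend):
    """Mix multichannel frames down to stereo.

    frames: list of frames, each frame a list with one sample per channel
    blend:  [gains for left channel, gains for right channel]
    """
    left_gains, right_gains = blend
    out = []
    for frame in frames:
        left = sum(g * s for g, s in zip(left_gains, frame))
        right = sum(g * s for g, s in zip(right_gains, frame))
        out.append([left, right])
    return out


# One frame per input channel, each carrying a unit sample
frames = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
stereo = blend_to_stereo(frames, [[1, 0.707, 0], [0, 0.707, 1]])

# Power of the centered channel: 0.707**2 + 0.707**2 is close to 1
center_power = stereo[1][0] ** 2 + stereo[1][1] ** 2
```

The centered channel receives 0.707 on both sides, so its summed power stays close to 1, which is the equal-loudness condition mentioned in the example.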

rewire(dic)

Rewire channels to flexibly allow weighted channel permutations.

Parameters:: dic (dict) – key = tuple of (source channel, destination channel) value = amplitude gain

Example

{(0, 1): 0.2, (5, 0): 0.4}: rewire channel 0 to 1 with gain 0.2, and 5 to 0 with gain 0.4, leaving other channels unmodified

Returns:: _ – Asig with rewired channels.
Return type:: Asig
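
A rough sketch of the rewiring idea on plain lists of frames follows. It assumes that the destination channel is replaced by the scaled source channel (rather than mixed with its previous content) and that all sources are read from the unmodified input. Both are assumptions about the behavior, and rewire_frames is a hypothetical name.

```python
def rewire_frames(frames, dic):
    """Route channels according to {(source, destination): gain}.

    Assumption: each destination channel is overwritten with
    source * gain; channels not named as a destination stay unchanged.
    """
    out = [list(frame) for frame in frames]
    for (src, dst), gain in dic.items():
        for i, frame in enumerate(frames):
            out[i][dst] = frame[src] * gain  # always read from the original
    return out


frames = [[1.0, 2.0, 3.0]]
rewired = rewire_frames(frames, {(0, 1): 0.5, (2, 0): 2.0})
```

Here channel 0 is sent to channel 1 at half gain, channel 2 is sent to channel 0 at double gain, and channel 2 itself is left as it was.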

pan2(pan=0.0)

Stereo panning of asig to a stereo output. Panning is based on constant power panning, see pan below. Behavior depends on the number of channels, self.channels:

multi-channel signals (self.channels>2) are cut back to stereo and treated as stereo signals
stereo signals (self.channels==2) are channelwise attenuated using cos(angle), sin(angle)
mono signals (self.channels==1) result in stereo output asigs.

Parameters:: pan (float) – panning between -1. (left) to 1. (right) (Default value = 0.)
Returns:: _ – Asig
Return type:: Asig
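
Constant power panning keeps left**2 + right**2 equal to 1 for every pan position. A minimal sketch of the gain computation is shown below. The linear mapping from pan in [-1, 1] to an angle in [0, pi/2] is a common choice and is assumed here; the text above only states that cos(angle) and sin(angle) are used.

```python
import math


def pan_gains(pan):
    """Return (left_gain, right_gain) for pan in [-1.0, 1.0].

    Assumption: pan is mapped linearly to an angle between 0 (hard left)
    and pi/2 (hard right).
    """
    angle = (pan + 1.0) * math.pi / 4.0
    return math.cos(angle), math.sin(angle)


left, right = pan_gains(0.0)          # centered: both gains about 0.707
hard_left = pan_gains(-1.0)           # all signal in the left channel
```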

remove_DC()

Remove the DC offset.

Parameters:: none –
Returns:: _ – channelwise DC-free Asig.
Return type:: Asig

norm(norm=1, in_db=False, dcflag=False)

Normalize signal

Parameters:

norm (float) – normalize threshold (Default value = 1)
in_db (bool) – Normally, norm takes amplitude, if in_db, norm’s unit is in dB.
dcflag (bool) – If true, remove DC offset (Default value = False)

Returns:

_ – normalized Asig.

Return type:

Asig
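
As a rough illustration of peak normalization with an optional dB target, consider the sketch below for a mono signal stored as a list. Scaling by the absolute peak is an assumption about what "normalize" means here, and normalize_peak is a hypothetical helper rather than pya's implementation.

```python
def normalize_peak(samples, norm=1.0, in_db=False):
    """Scale samples so that the absolute peak equals norm.

    If in_db is True, norm is interpreted in dB and converted to a
    linear amplitude first (0 dB -> 1.0, -20 dB -> 0.1).
    """
    if in_db:
        norm = 10.0 ** (norm / 20.0)
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return list(samples)  # silent input: nothing to scale
    return [norm * s / peak for s in samples]


unit_peak = normalize_peak([0.5, -0.25])
minus_20_db = normalize_peak([0.5, -0.25], norm=-20, in_db=True)
```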

gain(amp=None, db=None)

Apply gain in amplitude or dB; use only one of the two arguments. The argument can be either a scalar or a list (to apply an individual gain to each channel). The method returns a new asig with the gain applied.

Parameters:

amp (float or None) – Amplitude (Default value = None)
db (float or int or None) – Decibel (Default value = None)

Returns:

_ – Gain adjusted Asig.

Return type:

Asig

rms(axis=0)

Return signal’s RMS

Parameters:: axis (int) – Axis to perform np.mean() on (Default value = 0)
Returns:: _ – RMS value
Return type:: float
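
For reference, RMS is the square root of the mean of the squared samples. The sketch below computes it for a single channel stored as a list; the per-axis handling of multichannel arrays is not covered.

```python
import math


def rms(samples):
    """Root mean square of a non-empty sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


# One full cycle of a unit-amplitude sine: RMS approaches 1/sqrt(2)
n = 1000
sine = [math.sin(2.0 * math.pi * k / n) for k in range(n)]
sine_rms = rms(sine)
```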

plot(fn=None, offset=0, scale=1, x_as_time=True, ax=None, xlim=None, ylim=None, **kwargs)

Display signal graph

Parameters:

fn (func or None) – Keyword or function (Default value = None)
offset (int or float) – Offset each channel to create a stacked view (Default value = 0)
scale (float) – Scale the y value (Default value = 1)
xlim (tuple or list) – x axis range (Default value = None)
ylim (tuple or list) – y axis range (Default value = None)
**kwargs – keyword arguments for matplotlib.pyplot.plot()

Returns:

_ – self, you can use plt.show() to display the plot.

Return type:

Asig

get_duration(): Return the duration in second.

get_times(): Get time stamps for left-edge of sample-and-hold-signal

__eq__(other): Check if two asig objects have the same signal. But does not care about sr and others

__repr__(): Report key attributes

__mul__(other): Magic method for multiplying. You can either multiply a scalar or an Asig object. If muliplying an Asig, you don’t always need to have same size arrays as audio signals may different in length. If mix_mode is set to ‘bound’ the size is fixed to respect self. If not, the result will respect to whichever the bigger array is.

__rmul__(other)

__truediv__(other)

Magic method for division. You can either divide a scalar or an Asig object. Use division with caution, audio signal is common to reach 0 or near, avoid zero division or extremely large result.

If dividing an Asig, you don’t always need to have same size arrays as audio signals may different in length. If mix_mode is set to ‘bound’ the size is fixed to respect self. If not, the result will respect to whichever the bigger array is.

__rtruediv__(other)

__add__(other): Magic method for adding. You can either add a scalar or an Asig object. If adding an Asig, you don’t always need to have same size arrays as audio signals may different in length. If mix_mode is set to ‘bound’ the size is fixed to respect self. If not, the result will respect to whichever the bigger array is.

__radd__(other)

__sub__(other): Magic method for subtraction. You can either minus a scalar or an Asig object. If subtracting an Asig, you don’t always need to have same size arrays as audio signals may different in length. If mix_mode is set to ‘bound’ the size is fixed to respect self. If not, the result will respect to whichever the bigger array is.

__rsub__(other)

find_events(step_dur=0.001, sil_thr=-20, evt_min_dur=0, sil_min_dur=0.1, sil_pad=[0.001, 0.1])

Locate meaningful ‘events’ in the signal and create event list. Onset detection.

Parameters:

step_dur (float) – duration in seconds of each search step (Default value = 0.001)
sil_thr (int) – silent threshold in dB (Default value = -20)
evt_min_dur (float) – minimum duration to be counted as an event (Default value = 0)
sil_min_dur (float) – minimum duration to be counted as silent (Default value = 0.1)
sil_pad (list) – this allows you to add a small duration before and after the actual found event locations to the event ranges. If it is a list, you can set the padding (Default value = [0.001, 0.1])

Returns:

_ – This method returns self. But the list of events can be accessed through self._[‘events’]

Return type:

Asig

select_event(index=None, onset=None)

This method can be called after find_events (aka onset detection).

Parameters:

index (int or None) – Index of the event (Default value = None)
onset (int or None) – Onset of the event (Default value = None)

Returns:

_ – self

Return type:

Asig

plot_events()

fade_in(dur=0.1, curve=1)

Fade in the signal at the beginning

Parameters:

dur (float) – Duration in seconds to fade in (Default value = 0.1)
curve (float) – Curvature of the fader, power of the linspace function. (Default value = 1)

Returns:

_ – Asig, new asig with the fade in signal

Return type:

Asig
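
The curve parameter is described as the power applied to a linspace ramp. A minimal sketch of that idea for a mono signal stored as a list is given below. The helper takes the sampling rate explicitly, and both its name and the exact ramp endpoints (0 to 1 inclusive) are assumptions.

```python
def fade_in(samples, sr, dur=0.1, curve=1):
    """Return a copy of samples with a fade-in over the first dur seconds.

    The ramp is linspace(0, 1, n) ** curve, where n = int(dur * sr):
    curve=1 gives a linear fade, curve>1 starts more slowly.
    """
    n = min(int(dur * sr), len(samples))
    out = list(samples)
    for i in range(n):
        ramp = (i / (n - 1)) ** curve if n > 1 else 1.0
        out[i] = samples[i] * ramp
    return out


# Toy example: 6 samples at sr=4, fading in over 1 s (4 samples)
faded = fade_in([1.0] * 6, sr=4, dur=1.0, curve=2)
```

A fade-out can be sketched the same way by applying the reversed ramp to the last n samples.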

fade_out(dur=0.1, curve=1)

Fade out the signal at the end

Parameters:

dur (float) – duration in seconds to fade out (Default value = 0.1)
curve (float) – Curvature of the fader, power of the linspace function. (Default value = 1)

Returns:

_ – Asig, new asig with the fade out signal

Return type:

Asig

iirfilter(cutoff_freqs, btype='bandpass', ftype='butter', order=4, filter='lfilter', rp=None, rs=None)

iirfilter based on scipy.signal.iirfilter

Parameters:

cutoff_freqs (float or [float, float]) – Cutoff frequency or frequencies.
btype (str) – Filter type (Default value = ‘bandpass’)
ftype (str) – The type of IIR filter, e.g. ‘butter’, ‘cheby1’, ‘cheby2’, ‘ellip’, ‘bessel’ (Default value = ‘butter’)
order (int) – Filter order (Default value = 4)
filter (str) – The scipy.signal method to call when applying the filter coeffs to the signal. by default it is set to scipy.signal.lfilter (one-dimensional).
rp (float) – For Chebyshev and elliptic filters, provides the maximum ripple in the passband. (dB) (Default value = None)
rs (float) – For Chebyshev and elliptic filters, provides the minimum attenuation in the stop band. (dB) (Default value = None)

Returns:

_ – new Asig with the filter applied. also you can access b, a coefficients by doing self._[‘b’] and self._[‘a’]

Return type:

Asig

plot_freqz(worN, **kwargs)

Plot the frequency response of a digital filter. Perform scipy.signal.freqz then plot the response.

Parameters:

worN – TODO
**kwargs – TODO

envelope(amps, ts=None, curve=1, kind='linear')

Create an envelope and multiply it with the signal.

Parameters:

amps (array) – Amplitude of each breakpoint
ts (array) – Indices of each breakpoint (Default value = None)
curve (int) – Affecting the curvature of the ramp. (Default value = 1)
kind (str) – The type of interpolation (Default value = ‘linear’)

Returns:

_ – Returns a new asig with the envelope applied to its signal array

Return type:

Asig

adsr(att=0, dec=0.1, sus=0.7, rel=0.1, curve=1, kind='linear')

Create and apply an ADSR envelope to the signal.

Parameters:

att (float) – attack (Default value = 0)
dec (float) – decay (Default value = 0.1)
sus (float) – sustain (Default value = 0.7)
rel (float) – release. (Default value = 0.1)
curve (int) – affecting the curvature of the ramp. (Default value = 1)
kind (str) – The type of interpolation (Default value = ‘linear’)

Returns:

_ – returns a new asig with the envelope applied to its signal array

Return type:

Asig

window(win='triang', **kwargs)

Apply windowing to self.sig

Parameters:

win (str) – Type of window; check scipy.signal.get_window for available types. (Default value = ‘triang’)
**kwargs – keyword arguments for scipy.signal.get_window()

Returns:

_ – new asig with window applied.

Return type:

Asig

window_op(nperseg=64, stride=32, win=None, fn='rms', pad='mirror')

TODO add docstring

Parameters:

nperseg – (Default value = 64)
stride – (Default value = 32)
win – (Default value = None)
fn – (Default value = ‘rms’)
pad – (Default value = ‘mirror’)

overlap_add(nperseg=64, stride_in=32, stride_out=32, jitter_in=None, jitter_out=None, win=None, pad='mirror')

TODO

Parameters:

nperseg – (Default value = 64)
stride_in – (Default value = 32)
stride_out – (Default value = 32)
jitter_in – (Default value = None)
jitter_out – (Default value = None)
win – (Default value = None)
pad – (Default value = ‘mirror’)

to_spec(): Return Aspec object which is the rfft of the signal.

to_stft(**kwargs): Return Astft object which is the stft of the signal. Keyword arguments are the arguments for scipy.signal.stft().

to_mfcc(n_per_frame=None, hopsize=None, nfft=None, window='hann', nfilters=26, ncep=13, ceplifter=22, preemph=0.95, append_energy=True): Return Amfcc object.

plot_spectrum(offset=0, scale=1.0, xlim=None, **kwargs)

Plot spectrum of the signal

Parameters:

offset (float) – If self.sig is multichannel, this offsets each channel to create a stacked view for better viewing (Default value = 0.)
scale (float) – scale the y_axis (Default value = 1.)
xlim (tuple) – range of x_axis (Default value = None)
**kwargs – keywords arguments for matplotlib.pyplot.plot()

Returns:

_ – self

Return type:

Asig

spectrogram(*argv, **kvarg): Perform scipy.signal.spectrogram and return: frequencies, array of times, spectrogram

get_size(): Return signal array shape and duration in seconds.

append(asig, amp=1)

Append another asig to this one. Conditions: the appended asig should have the same number of channels. If the appended asig has a different sampling rate, it is resampled to match the original.

Parameters:

asig (Asig) – object to append
amp (float or int) – amplitude (Default value = 1)

Returns:

_ – Appended Asig object

Return type:

Asig

add(sig, pos=None, amp=1, onset=None)

Add a signal

Parameters:

sig (asig) – Signal to add
pos (int, None) – Position to add (Default value = None)
amp (float) – Amplitude (Default value = 1)
onset (float or None) – Similar to pos but in time rather than samples; giving a value here overrides pos (Default value = None)

Returns:

_ – Asig with the added signal.

Return type:

Asig

flatten(): Flatten a multidimentional array into a vector using np.ravel()

pad(width, tail=True, constant_values=0)

Pads the signal

Parameters:

width (int) – The number of samples to add to the tail or head of the array.
tail (bool) – True by default. If True, pad at the end; otherwise pad at the start.

Returns:

_ – Asig of the pad signal.

Return type:

Asig

custom(func, **kwargs): custom function method. TODO add example

class pya.Astft(x, sr=None, label=None, window='hann', nperseg=256, noverlap=None, nfft=None, detrend=False, return_onesided=True, boundary='zeros', padded=True, cn=None)

Audio spectrogram (STFT) class; attributes refer to scipy.signal.stft, with an additional attribute cn being the list of channel names, and label being the name of the Asig.

to_sig(**kwargs)

Create signal from stft, i.e. perform istft, kwargs overwrite Astft values for istft

Parameters:

**kwargs – optional keyword arguments used in istft: ‘sr’, ‘window’, ‘nperseg’, ‘noverlap’, ‘nfft’, ‘input_onesided’, ‘boundary’. ‘sr’ is also converted to ‘fs’, since scipy uses ‘fs’ as the sampling frequency.

Returns:

_ – Asig

Return type:

Asig

plot(fn=lambda x: ..., ax=None, offset=0, scale=1.0, xlim=None, ylim=None, show_bar=True, **kwargs)

Plot spectrogram

Parameters:

fn (func) – a function, by default is bypass
ch (int or str or None) – By default it is None,
ax (matplotlib.axes) – you can assign your plot to specific axes (Default value = None)
xlim (tuple or list) – x_axis range (Default value = None)
ylim (tuple or list) – y_axis range (Default value = None)
**kwargs – keyword arguments of matplotlib’s pcolormesh

Returns:

_ – self

Return type:

Asig

__repr__(): Return repr(self).

class pya.Aspec(x, sr=44100, label=None, cn=None)

Audio spectrum class using rfft

get_duration(): Return the duration in second.

to_sig(): Convert Aspec into Asig

weight(weights, freqs=None, curve=1, kind='linear')

TODO

Parameters:

weights –
freqs – (Default value = None)
curve – (Default value = 1)
kind – (Default value = ‘linear’)

plot(fn=np.abs, ax=None, offset=0, scale=1, xlim=None, ylim=None, **kwargs)

Plot spectrum

Parameters:

fn (func) – function for processing the rfft spectrum. (Default value = np.abs)
x_as_time (bool, optional) – By default the x axis displays the time; if False, it displays samples
xlim (tuple or list or None) – Set x axis range (Default value = None)
ylim (tuple or list or None) – Set y axis range (Default value = None)
offset (int or float) – The absolute value by which each plot is shifted vertically relative to the others.
scale (float) – Scaling factor of the plot, use in multichannel plotting.
**kwargs – Keyword arguments for matplotlib.pyplot.plot()

Returns:

_ – self

Return type:

Asig

__repr__(): Return repr(self).

class pya.Amfcc(x, sr=None, label='', n_per_frame=None, hopsize=None, nfft=None, window='hann', nfilters=26, ncep=13, ceplifter=22, preemph=0.95, append_energy=True, cn=None)

Mel filtered Fourier spectrum (MFCC) class. This class is inspired by jameslyons/python_speech_features, https://github.com/jameslyons/python_speech_features

Steps of mfcc:

Frame the signal into short frames.
For each frame calculate the periodogram estimate of the power spectrum.
Apply the mel filterbank to the power spectra, sum the energy in each filter.
Take the logarithm of all filterbank energies.
Take the DCT of the log filterbank energies.
Keep DCT coefficients 2-13, discard the rest.
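
The mel filterbank step relies on converting between Hertz and Mels. A commonly used pair of formulas (also used in python_speech_features) is sketched below. Whether pya's hz2mel and mel2hz use exactly these constants is an assumption, and the function names here are hypothetical.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hertz to Mels (2595 * log10(1 + hz / 700))."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Convert a value in Mels back to Hertz (inverse of hz_to_mel)."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


# Filter edges are typically spaced evenly in Mels, then mapped back to Hz
low_mel, high_mel = hz_to_mel(0.0), hz_to_mel(8000.0)
n_points = 5
edges_hz = [
    mel_to_hz(low_mel + i * (high_mel - low_mel) / (n_points - 1))
    for i in range(n_points)
]
```

Because the spacing is uniform in Mels, the resulting edges in Hertz are denser at low frequencies and wider apart at high frequencies.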

x

x can take two forms; the most commonly used is an Asig object, for example when an Amfcc is acquired directly from an Asig object via Asig.to_mfcc().

Type:: Asig or numpy.ndarray

sr

sampling rate, this is only necessary if x is not Asig.

Type:: int

duration

Duration of the signal in seconds.

Type:: float

label

A string label as an identifier.

Type:: str

n_per_frame

Number of samples per frame

Type:: int

hopsize

Number of samples between successive frames.

Type:: int

nfft

FFT size, default to be next power of 2 integer of n_per_frame

Type:: int

window

Type of the window function (Default value=’hann’), use scipy.signal.get_window to return a numpy array. If None, no windowing will be applied.

Type:: str

nfilters

The number of mel filters. Default is 26

Type:: int

ncep

Number of cepstral coefficients. Default is 13

Type:: int

ceplifter

Cepstral liftering coefficient. Default is 22

Type:: int

frames

The original signal reshaped into frames based on n_per_frame and hopsize.

Type:: numpy.ndarray

frame_energy

Total power spectrum energy of each frame.

Type:: numpy.ndarray

filter_banks

An array of mel filters

Type:: numpy.ndarray

cepstra

An array of the MFCC coefficients, size: nframes x ncep

Type:: numpy.ndarray

property nframes

property timestamp

property features: The features refer to the cepstra

__repr__(): Return repr(self).

static preemphasis(x, coeff=0.97)

Pre-emphasis filter to whiten the spectrum. Pre-emphasis is a way of compensating for the rapidly decaying spectrum of speech. This step can often be skipped for music, for example.

Parameters:

x (numpy.ndarray) – Signal array
coeff (float) – Preemphasis coefficient. The larger the stronger smoothing and the slower response to change.

Returns:

_ – The whitened signal.

Return type:

numpy.ndarray
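
Pre-emphasis is usually implemented as a first-order difference, y[n] = x[n] - coeff * x[n-1], with the first sample passed through unchanged. The sketch below shows this standard form on a plain list. It is an illustration of the technique and not necessarily identical to pya's implementation.

```python
def preemphasis(x, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1]; the first sample is kept as is."""
    if not x:
        return []
    return [x[0]] + [x[n] - coeff * x[n - 1] for n in range(1, len(x))]


# A constant (DC) signal is strongly attenuated after the first sample
flat = preemphasis([1.0, 1.0, 1.0], coeff=0.5)
```

Slowly varying (low-frequency) content is reduced while fast changes pass almost unchanged, which is what flattens the falling spectrum of speech.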

static mel_filterbanks(sr, nfilters=26, nfft=512, lowfreq=0, highfreq=None)

Compute a Mel-filterbank. The filters are stored in the rows, the columns correspond to fft bins. The filters are returned as an array of size nfilt * (nfft/2 + 1)

Parameters:

sr (int) – Sampling rate
nfilters (int) – The number of filters, default 26
nfft (int) – The size of FFT, default 512
lowfreq (int or float) – The lowest band edge of the mel filters, default 0 Hz
highfreq (int or float) – The highest band edge of the mel filters, default sr // 2

Returns:

_ – A numpy array of size nfilt * (nfft/2 + 1) containing filterbank. Each row holds 1 filter.

Return type:

numpy.ndarray

static lifter(cepstra, L=22)

Apply a cepstral lifter to the matrix of cepstra. This has the effect of increasing the magnitude of the high frequency DCT coeffs.

Liftering operation is similar to filtering operation in the frequency domain where a desired quefrency region for analysis is selected by multiplying the whole cepstrum by a rectangular window at the desired position. There are two types of liftering performed, low-time liftering and high-time liftering. Low-time liftering operation is performed to extract the vocal tract characteristics in the quefrency domain and high-time liftering is performed to get the excitation characteristics of the analysis speech frame.

Parameters:

cepstra (numpy.ndarray) – The matrix of mel-cepstra
L (int) – The liftering coefficient to use. Default is 22; since cepstra usually have 13 elements, L = 22 results in almost half pi of sine lift. It essentially tries to emphasize the lower cepstral coefficients while de-emphasizing the higher cepstral coefficients, as they are less discriminative for speech content.
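
A sinusoidal lifter of the kind used in python_speech_features weights coefficient n by 1 + (L / 2) * sin(pi * n / L). The sketch below applies that weighting to a list of cepstral frames. That pya uses exactly this formula is an assumption, and the helper names are hypothetical.

```python
import math


def lifter_weights(ncoeff, L=22):
    """Sinusoidal lifter weights 1 + (L / 2) * sin(pi * n / L) for n < ncoeff."""
    return [1.0 + (L / 2.0) * math.sin(math.pi * n / L) for n in range(ncoeff)]


def apply_lifter(cepstra, L=22):
    """Weight every frame (row) of cepstra; L <= 0 disables liftering."""
    if L <= 0:
        return [list(row) for row in cepstra]
    weights = lifter_weights(len(cepstra[0]), L)
    return [[c * w for c, w in zip(row, weights)] for row in cepstra]


weights = lifter_weights(13, L=22)
lifted = apply_lifter([[1.0] * 13], L=22)
```

With 13 coefficients and L = 22, the weights rise from 1 at n = 0 to their maximum of 1 + L / 2 = 12 at n = 11, so the covered range is roughly the first half period of the sine.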

plot(cmap='inferno', show_bar=True, offset=0, scale=1.0, xlim=None, ylim=None, x_as_time=True, nxlabel=8, ax=None, **kwargs)

Plot Amfcc.features via matshow, x is frames/time, y is the MFCCs

Parameters:

figsize ((float, float), optional, default: None) – Figure size, width, height in inches, Default = [6.4, 4.8]
cmap (str) – colormap for matplotlib. Default is ‘inferno’.
show_bar (bool, optional) – Default is True, show colorbar.
x_as_time (bool, optional) – Default is True, show x axis as time or sample index.
nxlabel (int, optional) – The number of labels on the x axis. Default is 8.

class pya.Aserver(sr=44100, bs=None, device=None, channels=2, backend=None, **kwargs)

Pya audio server. Based on pyaudio, it works as a FIFO-style audio stream pipeline, allowing Asig.play() to send audio segments into the stream.

Examples:

>>> from pya import *
>>> ser = Aserver()
>>> ser.boot()
AServer: sr: 44100, blocksize: ...,
         Stream Active: True, Device: ...
>>> asine = Ugen().sine()
>>> asine.play(server=ser)
Asig('sine'): 1 x 44100 @ 44100Hz = 1.000s cn=['0']

property device

default

static startup_default_server(**kwargs)

static shutdown_default_server()

__repr__(): Return repr(self).

get_devices(verbose=False): Return (and optionally print) available input and output device

set_device(idx, reboot=True)

Set audio device

Parameters:

idx (int) – Index of the device
reboot (bool) – If true the server will reboot. (Default value = True)

boot(): boot Aserver = start stream, setting its callback to this callback.

quit(): Aserver quit server: stop stream and terminate pa

play(asig, onset=0, out=0, **kwargs): Dispatch asigs or arrays for given onset.

_play_callback(in_data, frame_count, time_info, flag): callback function, called from pastream thread when data needed.

stop()

__del__()

class pya.Arecorder(sr=44100, bs=256, device=None, channels=None, backend=None, **kwargs)

Bases: pya.Aserver

pya audio recorder. Based on pyaudio, it uses callbacks to save audio data for pyaudio signals into Asigs.

Examples:

>>> from pya import Arecorder
>>> import time
>>> ar = Arecorder().boot()
>>> ar.record()
>>> time.sleep(1)
>>> ar.stop()
>>> print(ar.recordings)  
[Asig(''): ... x ... @ 44100Hz = ...

property device

set_tracks(tracks, gains)

Define the number of tracks to be recorded and their gains.

Parameters:

tracks (list or numpy.ndarray) – A list of input channel indices. By default None (record all channels)
gains (list or numpy.ndarray) – A list of gains in decibels. Needs to be the same length as tracks.

reset()

boot(): boot recorder

_recorder_callback(in_data, frame_count, time_info, flag): Callback function during streaming.

record(): Activate recording

pause(): Pause the recording, but the record_buffer remains

stop(): Stop recording, then stores the data from record_buffer into recordings

__repr__(): Return repr(self).

class pya.Ugen

Bases: pya.Asig.Asig

Unit generator to create an Asig with a predefined signal

sine(freq=440, amp=1.0, dur=None, n_rows=None, sr=44100, channels=1, cn=None, label='sine')

Generate Sine signal Asig object.

Parameters:

freq (int, float) – signal frequency (Default value = 440)
amp (int, float) – signal amplitude (Default value = 1.0)
dur (int, float) – duration in seconds. Use only one of dur and n_rows. (Default value = 1.0)
n_rows (int) – number of samples. Use only one of dur and n_rows. (Default value = None)
sr (int) – sampling rate (Default value = 44100)
channels (int) – number of channels (Default value = 1)
cn (list of string) – channel names as a list. The size needs to match the number of channels (Default value = None)
label (string) – identifier of the object (Default value = “sine”)

Return type:

Asig object

cos(freq=440, amp=1.0, dur=None, n_rows=None, sr=44100, channels=1, cn=None, label='cosine')

Generate Cosine signal Asig object.

Parameters:

freq (int, float) – signal frequency (Default value = 440)
amp (int, float) – signal amplitude (Default value = 1.0)
dur (int, float) – duration in seconds. Use only one of dur and n_rows. (Default value = 1.0)
n_rows (int) – number of samples. Use only one of dur and n_rows. (Default value = None)
sr (int) – sampling rate (Default value = 44100)
channels (int) – number of channels (Default value = 1)
cn (list of string) – channel names as a list. The size needs to match the number of channels (Default value = None)
label (string) – identifier of the object (Default value = “cosine”)

Return type:

Asig object

square(freq=440, amp=1.0, dur=None, n_rows=None, duty=0.5, sr=44100, sample_shift=0.5, channels=1, cn=None, label='square')

Generate square wave signal Asig object.

Parameters:

freq (int, float) – signal frequency (Default value = 440)
amp (int, float) – signal amplitude (Default value = 1.0)
dur (int, float) – duration in seconds. Use only one of dur and n_rows. (Default value = 1.0)
n_rows (int) – number of samples. Use only one of dur and n_rows. (Default value = None)
duty (float) – duty cycle (Default value = 0.5)
sr (int) – sampling rate (Default value = 44100)
channels (int) – number of channels (Default value = 1)
cn (list of string) – channel names as a list. The size needs to match the number of channels (Default value = None)
label (string) – identifier of the object (Default value = “square”)

Return type:

Asig object
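
The duty parameter can be pictured with a small standalone sketch. It assumes the usual meaning of duty cycle, the fraction of each period spent at the high level, and it ignores sample_shift, which the parameter list above does not describe. The function square_samples is hypothetical and is not how pya generates the signal.

```python
def square_samples(freq=440, amp=1.0, duty=0.5, n_samples=8, sr=44100):
    """Return square-wave samples: +amp for the first `duty` part of each period."""
    out = []
    for n in range(n_samples):
        phase = (freq * n / sr) % 1.0  # position inside the current period, 0..1
        out.append(amp if phase < duty else -amp)
    return out


# One period sampled 8 times with a 25% duty cycle.
print(square_samples(freq=1, duty=0.25, n_samples=8, sr=8))
# [1.0, 1.0, -1.0, -1.0, -1.0, -1.0, -1.0, -1.0]
```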

sawtooth(freq=440, amp=1.0, dur=None, n_rows=None, width=1.0, sr=44100, channels=1, cn=None, label='sawtooth')

Generate sawtooth wave signal Asig object.

Parameters:

freq (int, float) – signal frequency (Default value = 440)
amp (int, float) – signal amplitude (Default value = 1.0)
dur (int, float) – duration in seconds. Specify either dur or n_rows, not both. (Default value = 1.0)
n_rows (int) – number of samples. Specify either dur or n_rows, not both. (Default value = None)
width (float) – tooth width (Default value = 1.0)
sr (int) – sampling rate (Default value = 44100)
channels (int) – number of channels (Default value = 1)
cn (list of string) – channel names as a list. The size needs to match the number of channels (Default value = None)
label (string) – identifier of the object (Default value = “sawtooth”)

Return type:

Asig object

noise(type='white', amp=1.0, dur=None, n_rows=None, sr=44100, channels=1, cn=None, label='noise')

Generate noise signal Asig object.

Parameters:

type (string) – type of noise, currently available: ‘white’ and ‘pink’ (Default value = ‘white’)
amp (int, float) – signal amplitude (Default value = 1.0)
dur (int, float) – duration in seconds. Specify either dur or n_rows, not both. (Default value = 1.0)
n_rows (int) – number of samples. Specify either dur or n_rows, not both. (Default value = None)
sr (int) – sampling rate (Default value = 44100)
channels (int) – number of channels (Default value = 1)
cn (list of string) – channel names as a list. The size needs to match the number of channels (Default value = None)
label (string) – identifier of the object (Default value = “noise”)

Return type:

Asig object

pya.basicplot(data, ticks, channels, offset=0, scale=1, cn=None, ax=None, typ='plot', cmap='inferno', xlim=None, ylim=None, xlabel='', ylabel='', show_bar=False, **kwargs)

Basic plotting routine for pya. Asig can use it directly; Aspec, Astft and Amfcc pass additional settings and a different plot type.

Parameters:

data (numpy.ndarray) – data array
channels (int) – number of channels
ax (matplotlib.axes, optional) – Plot the image on this matplotlib axis if given. Default is None, in which case plt.gca() is used.
typ (str, optional) – Plot type.

pya.gridplot(pya_objects, colwrap=1, cbar_ratio=0.04, figsize=None)

Create a grid plot of pya objects that have plot() methods, i.e. Asig, Aspec, Astft, Amfcc. It takes a list of pya_objects and plots each object into a grid. You can mix different types of plots together.

Examples

# plot all 4 different pya objects in 1 column, amfcc and astft use pcolormesh so colorbar will
# be displayed as well
gridplot([asig, amfcc, aspec, astft], colwrap=2, cbar_ratio=0.08, figsize=[10, 10]);

Parameters:

pya_objects (iterable object) – A list of pya objects with the plot() method.
colwrap (int, optional) – Wrap column at position. Can be considered as the column size. Default is 1, meaning 1 column.
cbar_ratio (float, optional) – For each column create another column reserved for the colorbar. This is the ratio of the width relative to the plot. 0.04 means 4% of the width of the data plot.
figsize (tuple, optional) – width, height of the entire image in inches. Default size is (6.4, 4.8)

Returns:

fig – The plt.figure() object

Return type:

plt.figure()

pya.audio_read(fp)

exception pya._error

Bases: Exception

Common base class for all non-exit exceptions.

pya.linlin(x, smi, sma, dmi, dma)

Linear mapping

Parameters:

x (float) – input value
smi (float) – input range’s minimum
sma (float) – input range’s maximum
dmi (float) – output range’s minimum
dma (float) – output range’s maximum

Returns:

_ – mapped output

Return type:

float
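
A linear mapping from the input range [smi, sma] to the output range [dmi, dma] can be written in one line. The sketch below is a standalone illustration under that reading of the parameters. The name linlin_sketch is hypothetical, and the sketch does not clip values outside the input range (whether pya clips is not stated here).

```python
def linlin_sketch(x, smi, sma, dmi, dma):
    """Map x linearly from [smi, sma] to [dmi, dma] (no clipping)."""
    return (x - smi) / (sma - smi) * (dma - dmi) + dmi


print(linlin_sketch(0.5, 0.0, 1.0, 0.0, 10.0))  # 5.0
# Map a 7-bit controller value to the range [-1, 1].
print(linlin_sketch(64, 0, 127, -1.0, 1.0))
```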

pya.midicps(m): Convert midi number into cycle per second

pya.cpsmidi(c): Convert cycle per second into midi number

pya.dbamp(db): Convert db to amplitude

pya.ampdb(amp): Convert amplitude to db
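
For orientation, the four conversions above are usually defined as in the sketch below. It assumes the common conventions of MIDI note 69 = 440 Hz in equal temperament and decibels as 20·log10 of an amplitude ratio. The *_sketch names are hypothetical, and the formulas are the textbook ones rather than a copy of pya's code.

```python
import math


def midicps_sketch(m):
    """MIDI note number -> frequency in Hz (assumes A4 = note 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((m - 69) / 12.0)


def cpsmidi_sketch(c):
    """Frequency in Hz -> (fractional) MIDI note number."""
    return 69.0 + 12.0 * math.log2(c / 440.0)


def dbamp_sketch(db):
    """Decibels -> linear amplitude."""
    return 10.0 ** (db / 20.0)


def ampdb_sketch(amp):
    """Linear amplitude (> 0) -> decibels."""
    return 20.0 * math.log10(amp)


print(midicps_sketch(69))     # 440.0
print(cpsmidi_sketch(880.0))  # 81.0
print(dbamp_sketch(0.0))      # 1.0
```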

pya.spectrum(sig, samples, channels, sr)

Return the spectrum of a given signal. Returns a spectrum matrix if the input signal is multi-channel.

Parameters:

sig (numpy.ndarray) – signal array
samples (int) – total amount of samples
channels (int) – signal channels
sr (int) – sampling rate

Returns:

frq (numpy.ndarray) – frequencies
Y (numpy.ndarray) – FFT of the signal.

pya.normalize(d): Return the normalized input array

pya.audio_from_file(path, dtype=np.float32): Load an audio buffer using audioread. This loads one block at a time, and then concatenates the results.

pya.buf_to_float(x, n_bytes=2, dtype=np.float32)

Convert an integer buffer to floating point values. This is primarily useful when loading integer-valued wav data into numpy arrays.

Parameters:

x (np.ndarray [dtype=int]) – The integer-valued data buffer
n_bytes (int [1, 2, 4]) – The number of bytes per sample in x
dtype (numeric type) – The target output type (default: 32-bit float)

Returns:

x_float – The input data buffer cast to floating point

Return type:

np.ndarray [dtype=float]
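
The idea behind such a conversion is to scale signed integers by the reciprocal of the largest magnitude representable in n_bytes. The sketch below is a pure-Python illustration that works on raw bytes instead of a numpy array. It assumes little-endian signed samples, and the name buf_to_float_sketch is hypothetical.

```python
def buf_to_float_sketch(buf, n_bytes=2):
    """Convert little-endian signed integer bytes to floats in [-1.0, 1.0)."""
    # For 16-bit data the scale is 1 / 32768.
    scale = 1.0 / float(1 << (8 * n_bytes - 1))
    samples = []
    for i in range(0, len(buf), n_bytes):
        value = int.from_bytes(buf[i:i + n_bytes], byteorder="little", signed=True)
        samples.append(scale * value)
    return samples


# Three 16-bit samples: 0, half scale, and negative full scale.
raw = b"".join(v.to_bytes(2, "little", signed=True) for v in (0, 16384, -32768))
print(buf_to_float_sketch(raw))  # [0.0, 0.5, -1.0]
```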

pya.device_info(): Return a formatted string about available audio devices and their info

pya.find_device(min_input=0, min_output=0)

pya.padding(x, width, tail=True, constant_values=0)

Pad a signal by a given width; supports 1-D to 3-D arrays. Use it to add silence to a signal.

Parameters:

x (np.ndarray) – A numpy array
width (int) – The amount of padding.
tail (bool) – If true pad to the tail, else pad to the start.
constant_values (int or float or None) – The value to pad with. Passing None pads the array with NaN.

Returns:

_ – Padded array

Return type:

np.ndarray
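
For the 1-D case, the effect of tail and constant_values can be sketched with plain lists. This is only an illustration of the parameters described above. The helper pad_1d is hypothetical and does not cover the 2-D and 3-D cases.

```python
def pad_1d(x, width, tail=True, constant_values=0):
    """Pad a 1-D sequence with `width` copies of the fill value."""
    # None means "pad with NaN", following the parameter description.
    fill = float("nan") if constant_values is None else constant_values
    pad = [fill] * width
    return list(x) + pad if tail else pad + list(x)


print(pad_1d([1, 2, 3], 2))              # [1, 2, 3, 0, 0]
print(pad_1d([1, 2, 3], 2, tail=False))  # [0, 0, 1, 2, 3]
```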

pya.is_pow2(val): Check whether the input is a power of 2; returns a bool.

pya.next_pow2(x)

Find the smallest power of 2 that is greater than or equal to x, based on shift_bit_length

Parameters: x (int) – A positive number
Returns: _ – The smallest power of 2 that is greater than or equal to the input x.
Return type: int
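
Both power-of-2 helpers can be expressed with integer bit tricks. The sketch below shows one common way to do it with int.bit_length. The *_sketch names are hypothetical, and this is not necessarily the exact code pya uses.

```python
def is_pow2_sketch(val):
    """True if val is a positive integer with exactly one bit set."""
    return val > 0 and (val & (val - 1)) == 0


def next_pow2_sketch(x):
    """Smallest power of 2 that is >= x, for a positive integer x."""
    return 1 << (x - 1).bit_length()


print(is_pow2_sketch(8), is_pow2_sketch(6))      # True False
print(next_pow2_sketch(5), next_pow2_sketch(8))  # 8 8
```

A typical use is choosing an FFT length: next_pow2_sketch(1000) gives 1024.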

pya.round_half_up(number): Round up if >= .5
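
Python's built-in round() rounds ties to the nearest even integer, which is why a dedicated half-up helper is useful. A minimal sketch for non-negative numbers follows. The name round_half_up_sketch is hypothetical, and the handling of negative inputs is left open because it is not specified above.

```python
import math


def round_half_up_sketch(number):
    """Round a non-negative number so that a fractional part of .5 rounds up."""
    return int(math.floor(number + 0.5))


print(round(2.5), round_half_up_sketch(2.5))  # 2 3
print(round(3.5), round_half_up_sketch(3.5))  # 4 4
```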

pya.rolling_window(a, window, step=1)

pya.signal_to_frame(sig, n_per_frame, frame_step, window=None, stride_trick=True)

Frame a signal into overlapping frames.

Parameters:

sig (numpy.ndarray) – The audio signal
n_per_frame (int) – Number of samples in each frame
frame_step (int) – Number of samples after the start of the previous frame that the next frame should begin.
window (numpy.ndarray or None) – A window array, e.g,
stride_trick (bool) – Use stride trick to compute the rolling window and window multiplication faster

Returns:

_ – an array of frames.

Return type:

numpy.ndarray
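
Framing can be pictured as sliding a window of n_per_frame samples forward by frame_step samples at a time. The pure-Python sketch below illustrates this with lists. The helper frame_signal is hypothetical. It drops trailing samples that do not fill a whole frame, which is an assumption (an implementation may instead zero-pad the last frame), and it does not use any stride trick.

```python
def frame_signal(sig, n_per_frame, frame_step, window=None):
    """Split sig into overlapping frames; optionally multiply each by window."""
    if len(sig) < n_per_frame:
        return []
    n_frames = 1 + (len(sig) - n_per_frame) // frame_step
    frames = []
    for i in range(n_frames):
        start = i * frame_step
        frame = list(sig[start:start + n_per_frame])
        if window is not None:
            # Element-wise multiplication with a window of length n_per_frame.
            frame = [s * w for s, w in zip(frame, window)]
        frames.append(frame)
    return frames


print(frame_signal(list(range(10)), n_per_frame=4, frame_step=2))
# [[0, 1, 2, 3], [2, 3, 4, 5], [4, 5, 6, 7], [6, 7, 8, 9]]
```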

pya.magspec(frames, NFFT)

Compute the magnitude spectrum of each frame in frames. If frames is an NxD matrix, output will be Nx(NFFT/2+1).

Parameters:

frames (numpy.ndarray) – The framed array, each row is a frame, can be just a single frame.
NFFT (int) – FFT length. If NFFT > frame_len, the frames are zero-padded.

Returns:

_ – If frames is an NxD matrix, output will be Nx(NFFT/2+1). Each row will be the magnitude spectrum of the corresponding frame.

Return type:

numpy.ndarray

pya.powspec(frames, NFFT)

Compute the power spectrum of each frame in frames, first computing the magnitude spectrum.

Parameters:

frames (numpy.ndarray) – Framed signal, can be just a single frame.
NFFT (int) – The FFT length to use. If NFFT > frame_len, the frames are zero-padded.

Returns:

_ – Power spectrum of the framed signal. Each row has the size of NFFT / 2 + 1 due to rfft.

Return type:

numpy.ndarray
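
The output shape N x (NFFT/2+1) follows from keeping only the non-negative frequency bins of a real-input FFT. The sketch below uses a naive DFT in pure Python to make this visible. The *_sketch names are hypothetical. Truncating frames longer than NFFT and defining power as the plain squared magnitude (without any 1/NFFT scaling) are assumptions, since neither is stated above.

```python
import cmath


def magspec_sketch(frames, nfft):
    """Magnitude spectrum per frame via a naive DFT; rows have nfft // 2 + 1 bins."""
    out = []
    for frame in frames:
        # Zero-pad short frames to nfft samples (longer frames are truncated).
        padded = (list(frame) + [0.0] * nfft)[:nfft]
        row = []
        for k in range(nfft // 2 + 1):
            acc = sum(x * cmath.exp(-2j * cmath.pi * k * n / nfft)
                      for n, x in enumerate(padded))
            row.append(abs(acc))
        out.append(row)
    return out


def powspec_sketch(frames, nfft):
    """Squared magnitude spectrum per frame (unscaled, by assumption)."""
    return [[m * m for m in row] for row in magspec_sketch(frames, nfft)]


mags = magspec_sketch([[1.0, 0.0, 0.0, 0.0], [1.0, 1.0]], nfft=4)
print(len(mags), len(mags[0]))  # 2 3
```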

pya.hz2mel(hz)

Convert a value in Hertz to Mels

Parameters:

hz (number or array) – value in Hz, can be an array

Returns:

_ (number or array) – value in Mels, same type as the input.

pya.mel2hz(mel)

Convert a value in Mels to Hertz

Parameters:

mel (number or array) – value in Mels, can be an array

Returns:

_ (number or array) – value in Hz, same type as the input.
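
Several Mel scale definitions exist. One widely used pair of formulas is sketched below for scalar inputs. Which variant pya implements is not stated here, so the constants 2595 and 700 are an assumption, and the *_sketch names are hypothetical.

```python
import math


def hz2mel_sketch(hz):
    """Hertz -> Mels using the common 2595 * log10(1 + hz / 700) formula."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel2hz_sketch(mel):
    """Mels -> Hertz, the inverse of hz2mel_sketch."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


print(hz2mel_sketch(0.0))                              # 0.0
print(round(mel2hz_sketch(hz2mel_sketch(1000.0)), 6))  # 1000.0
```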

class pya.DummyBackend(dummy_devices=None)

Bases: pya.backend.base.BackendBase

Helper class that provides a standard way to create an ABC using inheritance.

dtype = 'float32'

range = 1

bs = 256

get_device_count()

get_device_info_by_index(idx)

get_default_input_device_info()

get_default_output_device_info()

open(*args, input_flag, output_flag, rate, frames_per_buffer, channels, stream_callback=None, **kwargs)

process_buffer(buffer)

terminate()

class pya.PyAudioBackend(format=pyaudio.paFloat32)

Bases: pya.backend.base.BackendBase

Helper class that provides a standard way to create an ABC using inheritance.

_boot_delay = 0.5

bs = 512

get_device_count()

get_device_info_by_index(idx)

get_default_input_device_info()

get_default_output_device_info()

open(rate, channels, input_flag, output_flag, frames_per_buffer, input_device_index=None, output_device_index=None, start=True, input_host_api_specific_stream_info=None, output_host_api_specific_stream_info=None, stream_callback=None)

process_buffer(buffer)

terminate()

pya.determine_backend(force_webaudio=False, port=8765)

pya.startup(**kwargs)

pya.shutdown(**kwargs)
