We hope that the following frequently asked questions will help you in your continued journey to learn about SoundFont technology. If you have a question
that is not covered here, please do not hesitate to send an email to the webmaster - we will
do our best to help you!
What Is This?
What are SoundFont Banks?
What is sampling?
Why should I choose SoundFont Technology?
Formats and Compatibility
Is the SoundFont 2.0 format public?
What devices and applications are SoundFont-compatible?
What is the difference between SoundFont 2.0 and SoundFont 2.1?
What devices are compatible with SoundFont 2.1?
Why?
Why was there a revision to SoundFont 1.0?
How/What To Do?
How can I tell if a bank is in SoundFont 1.0 or SoundFont 2.0 format?
What's Out There?
Where is the SoundFont 2.1 specification?
What SoundFont and SoundFont-compatible banks are available from E-MU / ENSONIQ today?
Where is E-MU's Professional SoundFont-Bank Editor? What is the Preditor Tool mentioned in the SoundFont 2.0 specification?
Details!
What is the overall structure of the SoundFont 2.0 format?
What kinds of articulation data are supported by SoundFont 2.0?
In what units is the articulation data stored?
How do "stereo" sounds work in SoundFont 2.0?
Who is controlling the SoundFont 2.1 specification?
What else can you tell me?
More Information
Where can I find more information and resources?
What are SoundFont banks?
SoundFont technology is a sample format that was invented by E-MU / ENSONIQ for the purpose of creating a flexible wavetable synthesis solution for Creative Labs.
E-MU / ENSONIQ brought their expertise, which has set the standard for professional sample formats, and created a solution that would be embraced for consumer and
professional applications.
What is sampling?
Sampling is a term used to describe a recorded sound, or a large group of sounds, that can be controlled and manipulated in real time with a synthesizer.
Samplers usually store these sounds in RAM (random access memory) so that the user can replace the existing sounds with his or her own, or purchase more
sounds. This makes sampling hardware flexible enough to meet the needs of a variety of audio applications.
Why should I choose SoundFont Technology?
Millions of SoundFont-compatible devices have been purchased throughout the world, with Creative's AWE and Live! series of sound cards accounting for most
of that number.
With sample technology, you can create a wide variety of expressive sounds from very little audio data. With the power to apply filters, volume envelopes,
pan instructions, effect sends, and LFO's (low frequency oscillators) that can control the value of many of these effects, the possibilities are endless.
You can use SoundFont technology to enhance the sound in a game, or use it as a set of musical instruments to create a sequence that sounds as if an
orchestra were playing your original music on your computer. SoundFont technology is a powerful tool that has already been deployed to the masses.
Is the SoundFont 2.0 format public?
Yes. E-MU / ENSONIQ and Creative Technology are actively promoting SoundFont 2.0 as an open standard. We have worked diligently to make complete,
unambiguous documentation and a suite of tools available to developers who want to use the SoundFont 2.0 format.
What devices and applications are SoundFont-compatible?
Please consult our list of SoundFont-compatible products.
What is the difference between SoundFont 2.0 and SoundFont 2.1?
SoundFont 2.1 offers more flexibility in how samples within a SoundFont bank can be controlled. For more information, please consult these documents:
Why SoundFont 2.1?
SoundFont 2.1 Specification
What devices are compatible with SoundFont 2.1?
At this time, Creative's Sound Blaster Audigy and Sound Blaster Live! for Macintosh, and E-MU's Audio Production Studio,
E-Card and E-MU PC are the only SoundFont 2.1-compatible devices.
Why was there a revision to SoundFont 1.0?
After a year of experience with SoundFont 1.0, E-MU / ENSONIQ and Creative Technology realized there had been a number of omissions in the SoundFont 1.0
realization of the original SoundFont concept. Rather than sticking with the SoundFont 1.0 format and putting up with these inadequacies, E-MU / ENSONIQ
and Creative determined that the public release of the SoundFont standard would be a good time to make a clean sweep and revise the format to include all
the necessary features.
How can I tell if a bank is in SoundFont 1.0 or SoundFont 2.0 format?
PC files in SoundFont 1.0 format conventionally have the suffix .sbk, while such files in SoundFont 2.0 format have the suffix .sf2. Data contained within
the information portion of the SoundFont format also specifies the version of the format. The portion of the bank which contains the format version was
left unchanged with the revision, so a single software method may be used to distinguish the formats.
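As a sketch of that single software method, the following Python reads the format version from a bank's bytes. It assumes the bank is a well-formed RIFF "sfbk" file and takes a shortcut, scanning for the version sub-chunk (the 'ifil' chunk in the information portion) rather than walking the full RIFF chunk tree; the function name and error handling are illustrative, not from any specification text.

```python
import struct

def soundfont_version(data: bytes):
    """Return the (major, minor) format version of a SoundFont bank.

    Minimal sketch: scans the raw bytes for the 'ifil' version sub-chunk,
    which the information portion of the bank carries in both the 1.0 and
    2.0 formats, instead of parsing the whole RIFF tree.
    """
    if data[:4] != b"RIFF" or data[8:12] != b"sfbk":
        raise ValueError("not a SoundFont (RIFF/sfbk) bank")
    pos = data.find(b"ifil")
    if pos < 0:
        raise ValueError("no 'ifil' version chunk found")
    size, = struct.unpack_from("<I", data, pos + 4)   # chunk byte count
    if size != 4:
        raise ValueError("unexpected 'ifil' chunk size")
    return struct.unpack_from("<HH", data, pos + 8)   # (major, minor)
```

A bank reporting major version 1 is in SoundFont 1.0 format; major version 2 covers both SoundFont 2.0 and 2.1.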
Where is the SoundFont 2.1 specification?
The SoundFont 2.1 specification may be found here.
What SoundFont and SoundFont-compatible banks are available from E-MU / ENSONIQ today?
E-MU / ENSONIQ has a number of SoundFont bank products available for purchase at this site. Click here to shop for
sounds at SoundFont.com!
Where is E-MU's Professional SoundFont-Bank Editor? What is the Preditor Tool mentioned in the SoundFont 2.0 specification?
While SoundFont 2.0 was in the specification stage, E-MU / ENSONIQ began developing an internal editing and auditioning utility project in order to test and
continually improve the newly revamped format. That project was code named "Preditor" and was originally being considered for public release as a
professional SoundFont bank editor.
However, for various reasons, mostly the cost of packaging, distribution, and customer support, "Preditor" was NOT made into a finished product.
What is the overall structure of the SoundFont 2.0 format?
A SoundFont bank (or SoundFont-compatible bank) contains both the digital audio samples which have been captured from a sound source, and the instructions
to the wavetable synthesizer on how to articulate this sound based on the musical or sonic context as expressed by MIDI (called articulation data). A
SoundFont bank also contains information about the creation and intended use of the particular bank.
A SoundFont bank is stored in the industry-standard RIFF format. There are three major portions, or "chunks", contained within a SoundFont bank: one for
information, one for sample data, and one for articulation data.
The information chunk contains information about the bank. The Sample Data chunk contains the digitized waveforms used in the SoundFont bank itself. The
articulation data chunk is a bit more involved.
The articulation data chunk uses two levels of articulation data which sit atop a level of sample header data that describes the sample data itself. Each
level of articulation data references the level beneath it, thus allowing for reuse of resources.
SF2 Articulation Data Hierarchy:
PRESET LEVEL
.
.
.
Preset m
Name
Variation Bank Index
Program Index
Layer n
Articulation data
Instrument i Reference
Layer n+1
Articulation data
Instrument h Reference
.
.
.
Preset m+1
Name
Variation Bank Index
Program Index
Layer o
Articulation data
Instrument p Reference
Layer o+1
Articulation data
Instrument a Reference
.
.
.
INSTRUMENT LEVEL
.
.
.
Instrument i
Name
Split j
Articulation Data
Sample s Reference
Split k
Articulation Data
Sample b Reference
.
.
.
Instrument i+1
Name
Split l
Articulation Data
Sample s Reference
Split l+1
Articulation Data
Sample t Reference
.
.
.
SAMPLE HEADER LEVEL
.
.
.
Sample s
Name
Location and loop points
Information
Sample s+1
Name
Location and loop points
Information
.
.
.
The lowest level of articulation data is called the "Sample Header" level. It contains a list of data structures, each describing one particular waveform
contained within the SoundFont bank or a particular waveform contained within a wavetable ROM. Each data structure has a unique name for each sample,
information regarding the location of the sample relative to the beginning of the sample data contained within the bank (or the absolute location of the
sample in a particular wavetable ROM), sample loop points, original sample rate of the digitized waveform, and other information about the samples.
The combination of the Sample Header data and the sample data for a single sampled waveform in a SoundFont bank is the rough equivalent of a WAV file.
However, stereo SoundFont sounds are quite different from stereo WAV sounds. See the question below for details.
The next highest level of articulation data is called the "Instrument" level. Instruments are subdivided into "Splits". A Split is the combination of a key
range and/or a velocity range, a reference to a particular sample header within the SoundFont bank, and articulation data which is applied directly to that
sample.
By referencing samples, it is possible to have any split articulate any sample in the bank.
For example:
INSTRUMENT LEVEL
Instrument 1:
Split 1
Key 0-127
Articulation Data
Sample q
.
.
.
Instrument y:
Split 1
Key 0-127
Articulation Data
Sample q
.
.
.
SAMPLE LEVEL
.
.
Sample q:
Loop points, etc
.
.
By allowing multiple splits within a single instrument, it is possible to have different samples played back with varying keynumber and velocity, each with
independent articulation.
For example:
INSTRUMENT LEVEL
Instrument 1:
Split 1
Key 43-43
Articulation Data
Sample q
Split 2
Key 44-44
Vel 0-120
Other Articulation Data
Sample x
Split 3
Key 44-44
Vel 101-127
Other Articulation Data
Sample q
Split 4
Key 0-42
Other Articulation Data
Sample a
.
.
.
.
SAMPLE LEVEL
.
.
Sample a:
Loop points, etc
.
.
Sample q:
Loop points, etc
.
.
Sample x:
Loop points, etc
One can see that keys 0-42 play sample a, and key 43 plays sample q. So key ranges are variable. One can also see that key number 44 plays sample x when
struck softly, sample q when struck hard, and BOTH samples when struck at medium velocity. So velocity-sensitive samples are supported.
These features can be used to form drumkits, finely tuned complex instruments which do not sound "munchkin-like" after pitch shifting, or velocity dependent
sounds.
A combination of splits, which typically span the full range of possible keynumbers and velocity values, is collected to form an "Instrument".
The highest level of articulation data is called the "Preset" level. Presets are subdivided into "Layers". A Layer is the combination of a key range and/or
a velocity range, a reference to a particular instrument within the SoundFont bank, and articulation data which is applied RELATIVE to corresponding
articulation data within that instrument.
By referencing instruments, it is possible to have any layer articulate any instrument in the bank. By applying articulation data at the Preset level
RELATIVE to data at the Instrument level, it is possible to customize instruments which are professionally designed and fine tuned without having to
destroy the original material.
For example:
PRESET LEVEL
Preset 1:
Layer 1
Key 0-127
Add i% Reverb
Instrument x
.
.
.
INSTRUMENT LEVEL
Instrument 1:
Split 1
.
.
.
Here, i% reverb is ADDED to WHATEVER the reverb is on WHATEVER samples are used in the instrument!
By allowing multiple layers within a single Preset, it is possible to have different instruments exist at different parts of the keyboard or in different
velocity ranges, each with independent articulation.
This can be used to make presets in which your left hand plays a bass guitar and your right plays a piano:
PRESET LEVEL
Preset 1:
Layer 1
Key 0-63
Instrument "Bass Guitar"
Layer 2
Key 64-127
Instrument "Piano"
chorded versions of instruments:
PRESET LEVEL
Preset 1:
Layer 1
Key 0-127
Instrument "Piano"
Layer 2
Key 0-127
Add 4 semitones of pitch tuning
Instrument "Piano"
Layer 3
Key 0-127
Add 7 semitones of pitch tuning
Instrument "Piano"
velocity sensitive instruments:
PRESET LEVEL
Preset 1:
Layer 1
Key 0-127
Vel 0-100
Instrument "Piano"
Layer 2
Key 0-127
Vel 101-127
Instrument "Bass Guitar"
or any other custom variations of finely tuned instruments. All of this without the user having to worry about samples, loop points, copying large numbers
of splits, or other such complex issues. Expert users of Vienna 1.0 will recognize that creating presets like the above would be a nightmare with the
SoundFont 1.0 format! (EsBeeKay users COULD do the Piano/Bass Guitar, but not the chorded piano nor the "add i% reverb".)
A combination of layers, which typically span the full range of possible keynumbers and velocity values, is collected to form a "Preset".
Each Preset has a corresponding "variation bank" index and "program index" which are directly analogous to MIDI "bank change" and "program change"
commands.
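That correspondence means selecting a preset from incoming MIDI messages is just a matter of matching the bank-select and program-change values against each preset's two indices. A minimal sketch, where the list of tuples is an illustrative stand-in for a parsed SoundFont preset list:

```python
def find_preset(presets, bank, program):
    """Return the name of the preset whose variation bank index and
    program index match a MIDI bank-select / program-change pair.

    `presets` is a list of (bank, program, name) tuples, a simplified
    stand-in for parsed preset records.
    """
    for b, p, name in presets:
        if b == bank and p == program:
            return name
    return None  # a synthesizer would typically fall back to a default

# hypothetical bank contents for illustration:
presets = [(0, 0, "Grand Piano"), (0, 32, "Bass Guitar"), (128, 0, "Drum Kit")]
```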
A SoundFont bank is viewed as the combination of a collection of Presets, Instruments, and Samples, each self-contained and dependent only on the level
beneath it.
For example:
A single sample header and specified sample data are viewed as an entity.
A single instrument and all of the samples it uses are viewed as an entity.
A single preset and all of the instruments it uses are viewed as an entity.
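The three-level hierarchy described above can be sketched as nested data types. All class and field names here are illustrative simplifications for clarity, not the specification's on-disk record layouts:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SampleHeader:              # Sample Header level
    name: str
    start: int                   # offset into the bank's sample data
    end: int
    loop_start: int
    loop_end: int
    sample_rate: int

@dataclass
class Split:                     # Instrument level zone: key/vel -> sample
    key_range: range
    vel_range: range
    articulation: dict           # applied directly to the sample
    sample: SampleHeader

@dataclass
class Instrument:
    name: str
    splits: List[Split]

@dataclass
class Layer:                     # Preset level zone: key/vel -> instrument
    key_range: range
    vel_range: range
    articulation: dict           # applied RELATIVE to the instrument's data
    instrument: Instrument

@dataclass
class Preset:
    name: str
    bank: int                    # MIDI variation bank index
    program: int                 # MIDI program index
    layers: List[Layer]
```

Each object references only the level beneath it, which is what makes a preset, an instrument, or a sample reusable as a self-contained entity.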
What kinds of articulation data are supported by SoundFont 2.0?
Articulation data which is applied to a sample at note-on time is supported. Such articulation data is called a 'generator'. The list of generators includes
envelope, LFO, and filter parameters.
The ability to modify the value of any generator based on MIDI or computer-generated events is also supported. This includes key scaling and MIDI Continuous
Controller routing to any 'generator'. Such articulation data is called a 'modulator'. The list of possible modulators includes 'keynumber to filter
cutoff', 'MIDI CC 1 to Tremolo', etc. Note that modulators consist of a SOURCE, a DESTINATION, and an AMOUNT. The SOURCE (keynumber, MIDI CC, etc.)
is a MIDI or computer-generated event, whereas the DESTINATION (Filter Cutoff, Tremolo, etc.) is a GENERATOR. The AMOUNT by which a source changes a
destination is programmable, and that amount can either be static or controlled by another SOURCE.
Some features which sound like they should be 'modulators' are so important that we made them 'generators'. An example of this is 'Keynumber to Volume
Envelope Decay Time' being a GENERATOR. This parameter is vital in the creation of certain instruments (such as a piano).
Generators and Modulators may be used at the Preset Level and at the Instrument Level of a SoundFont bank. A comprehensive list of currently defined
generators and modulators may be seen in the SoundFont 2.1 specification.
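The SOURCE / AMOUNT / DESTINATION model above can be sketched in a few lines. This is a hedged illustration only: it assumes the source value is already normalized to the range 0.0 to 1.0 and that the modulator's contribution simply adds in the destination generator's units (the exact normalization curves are defined by the specification, not here):

```python
def apply_modulator(source_value, amount, destination_value):
    """Add a modulator's contribution to a destination generator.

    source_value      -- MIDI or computer event, normalized 0.0 to 1.0
                         (the normalization is an assumption for this sketch)
    amount            -- programmable scaling, in the destination's units
    destination_value -- the generator's current value, e.g. a filter
                         cutoff expressed in cents
    """
    return destination_value + source_value * amount

# e.g. a 'MIDI CC 1 to filter cutoff' modulator with an amount of -2400
# cents, at full controller depth, lowers an 8000-cent cutoff to 5600.
```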
In what units is the articulation data stored?
All articulation parameters at ALL levels are stored in well-defined REAL WORLD units, with resolutions at the MINIMUM PERCEPTUAL level. Thus the data does
not favor any single synthesizer over any other (except synthesizers which use the same units...). This allows the format to be easily portable and
exchangeable.
All units have the additional property of being "perceptually additive". This means that when Preset Level articulation or a Real-Time Modulator
"adds" an amount to the corresponding articulation in an Instrument, its effect is perceived as the same, regardless of the value of that data within the
instrument.
Examples of units with these properties would be "pitch in cents" or "attenuation in centibels."
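A brief numeric illustration of why these units are perceptually additive: adding in cents or centibels multiplies the underlying linear quantity, so a given offset has the same perceived effect regardless of the base value. The helper names below are my own, not the specification's:

```python
def cents_to_ratio(cents):
    """Pitch offset in cents -> frequency ratio (1200 cents per octave)."""
    return 2.0 ** (cents / 1200.0)

def centibels_to_gain(cb):
    """Attenuation in centibels -> linear amplitude gain (1 cB = 0.1 dB),
    so 200 cB of attenuation is 20 dB, a factor of 10 in amplitude."""
    return 10.0 ** (-cb / 200.0)

# Adding in cents multiplies frequency ratios: +500 cents followed by
# +700 cents lands exactly one octave (+1200 cents) up, wherever you start.
```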
How do "stereo" sounds work in SoundFont 2.0?
SoundFont banks do not contain "interleaved" samples. Instead, the sample header data contains "sample links", in which individual samples "point" to other
samples that are their stereo counterparts (or that are linked in a chain of samples). These references specify pairs or groups of sampled waveforms which
are to be played in perfect pitch phase. Those samples are played in perfect pitch phase IF both (or all) of the samples are used in the same INSTRUMENT.
With this approach, it is possible to use one sample of a stereo pair independently as a mono sample, or as one of a stereo pair of samples, or in both
roles in different instruments within the same SoundFont bank.
Each sample header contains a field in which it declares itself a mono, master, or slave sample, and another field containing a reference to the other
sample, or to the next sample in the chain.
Stereo sounds are produced when an INSTRUMENT contains two splits, each of which points to one of the two stereo samples, and each of which is triggered
with the same key/velocity combination. These splits may have INDEPENDENT articulation parameter settings (including pan position) EXCEPT those involving
pitch modulation.
For example:
Preset "Stereo Sound"
Layer 1
Instrument "Stereo Instrument"
Split 1
Articulation Data Parameters
Sample "Right Sample" (Master, points to "Left Sample")
Split 2
Articulation Data Parameters
Sample "Left Sample" (Slave, points to "Right Sample")
In the case of pitch, one split (called the MASTER split) controls the pitch for its sample AND the sample in the OTHER SPLIT (or SLAVE split). So changing
pitch modulation parameters (such as LFO to Pitch or fine tuning) in a SLAVE split does nothing, but changing such parameters in a MASTER split has that
modulation apply to BOTH samples!
There are no other "automatic" parameter settings in SoundFont stereo sounds. Pan position may be wherever you like (no automatic pan position of stereo-
paired samples). Each sample may be played at whatever loudness it is individually capable of (no automatic attenuation of stereo-paired sounds). Each
sample gets its own individual attenuation and filter settings (no automatic copying of articulation data from master split to slave split in stereo-paired
sounds). Thus, it is POSSIBLE to build a classic stereo sound (pan hard left/right, half volume on each, same articulation on each) in a SoundFont 2.0 bank,
but you are not CONFINED to those settings!
If a keynumber/velocity combination triggers only ONE split in an instrument and that split contains ONE of a pair of stereo samples, then you hear ONLY
that sample in MONO. This way, you can use the samples individually or in stereo if you like.
So if an INSTRUMENT only contains ONE sample in a stereo pair, that is a MONO instrument.
For example:
Preset "Mono Sound"
Layer 1
Instrument "Mono Instrument"
Split 1
Articulation Data Parameters
Sample "Right Sample" (Master, points to "Left Sample")
Even though the "Right Sample" still points to the "Left Sample", the "Left Sample" is never used in the "Mono Instrument". So that is a mono sound with
just the one sample!
If a PRESET contains two INSTRUMENTS, one of which holds only ONE sample in a stereo pair, and the other of which holds only THE OTHER sample in a stereo
pair, a stereo sound does NOT result. That would have the effect of FUNDAMENTALLY changing the nature of an instrument with nothing more than the
inclusion of another instrument. This defies the 'instrument as an entity' property.
For example:
Preset "NOT A Stereo Sound"
Layer 1
Instrument "Mono Instrument"
Split 1
Articulation Data Parameters
Sample "Right Sample" (Master, points to "Left Sample")
Layer 2
Instrument "Other Mono Instrument"
Split 1
Articulation Data Parameters
Sample "Left Sample" (Slave, points to "Right Sample")
Even though the "Right Sample" still points to the "Left Sample", AND the "Left Sample" IS used in another instrument, this
is NOT a stereo sound!
A single SoundFont bank may be made with all of the above examples:
PRESET LEVEL
Preset 1: "Stereo Sound"
Layer 1
Instrument 1
Preset 2: "Mono Sound"
Layer 1
Instrument 2
Preset 3: "NOT A Stereo Sound"
Layer 1
Instrument 2
Layer 2
Instrument 3
Preset 4: "Some other unrelated sound"
Layer 1
Instrument 4
INSTRUMENT LEVEL
Instrument 1: "Stereo Instrument"
Split 1
Articulation Data Parameters
Sample 1
Split 2
Articulation Data Parameters
Sample 2
Instrument 2: "Mono Instrument"
Split 1
Articulation Data Parameters
Sample 2
Instrument 3: "Other Mono Instrument"
Split 1
Articulation Data Parameters
Sample 2
Instrument 4: "Some other unrelated instrument"
Split 1
Articulation Data Parameters
Sample 3
SAMPLE LEVEL
Sample 1: "Right Sample" (Master, points to "Left Sample")
Loop points, etc
Master
Points to Sample 2
Sample 2: "Left Sample" (Slave, points to "Right Sample")
Loop points, etc
Slave
Points to Sample 1
Sample 3: "Some other sample"
Loop points, etc
Mono
Pointer is irrelevant
This example does not use PRESET LEVEL Articulation Data Parameters, but you can do that too!
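The rule that both samples of a linked pair must appear in the SAME instrument can be sketched as a simple check. The dict-based shapes here are illustrative stand-ins for the sample-header type and link fields, not the on-disk format:

```python
def instrument_plays_stereo(split_samples, links, types):
    """Return True if an instrument will actually sound in stereo.

    split_samples -- set of sample names referenced by the instrument's splits
    links         -- sample name -> name of its linked stereo counterpart
    types         -- sample name -> "mono", "master", or "slave"

    Per the rules above, a pair only sounds in stereo when BOTH linked
    samples are referenced by splits of the same instrument.
    """
    for name in split_samples:
        if types.get(name) in ("master", "slave") and links.get(name) in split_samples:
            return True
    return False
```

Applied to the bank above: Instrument 1 ("Stereo Instrument") passes the check, while Instruments 2 and 3 each reference only one half of the pair and therefore play in mono, even though the link fields still point at each other.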
Finally, "stereo" sound in SoundFont 2.0 is merely a special case of a general mechanism for any number of pitch-phase-locked sample sounds. In the case of
"stereo" sound, that number is 2. Thus it is possible to pitch-phase-lock as many voices as you like, so long as the link information in the sample headers
forms a CLOSED CHAIN of samples.
For example: Sample 1 points to Sample 2, which points to Sample 3... which points to Sample n, which points back to Sample 1.
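A small sketch of validating that closed-chain requirement, assuming sample links are represented as a name-to-name mapping (an illustrative simplification of the sample-header link field):

```python
def is_closed_chain(links, start):
    """Follow sample links from `start` and verify they return to `start`
    without dead-ending or escaping into an unrelated cycle."""
    seen = set()
    current = start
    while current not in seen:
        seen.add(current)
        current = links.get(current)
        if current is None:          # broken link: chain is not closed
            return False
    return current == start
```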
Who is controlling the SoundFont 2.1 specification?
As of the date of this document, the SoundFont 2.1 specification is controlled by the
Creative Advanced Technology Center.
What else can you tell me?
For further details regarding how splits/layers are contained, what articulation parameters are defined and available, and what specific units those
parameters are stored in, please see the SoundFont 2.1 specification.
Where can I find more information and resources?
For more information on SoundFont technology, please consult our tutorials section.