last updated: May 9, 2012
1. What is the problem with converting 24-bit data files to 16-bit data files? |
In BioSemi's BDF data format, each analog value is converted to a digital 3-byte (24-bit) representation, meaning that integers ranging from -8,388,608 to +8,388,607 can be used to represent any particular analog value. BioSemi's ActiveTwo hardware setup records a physical data range from -262,144 to +262,143 µV (524 mV; see BioSemi's ActiveTwo documentation on a 24-bit system and digital resolution), which means that each integer step corresponds to a gain of 0.03125 µV (524,288 µV / 16,777,216).
In contrast, systems using a 2-byte (16-bit) data storage format can represent any particular analog value only by integers ranging from -32,768 to +32,767. A typical gain would be 0.5 µV per integer value, meaning that a physical data range from -16,384 to +16,384 µV would be covered. Data outside these range limits would saturate or "pin" at the integer minimum or maximum. The above graphic depicts to scale the (signed) 24-bit and 16-bit data recording ranges (512 pixels in green for 24-bit, 2 pixels in red for 16-bit). If one tried to record a physical data range from -262,144 to +262,143 µV with 16 bits, one would have to sacrifice gain resolution: one integer step would correspond to 8 µV (524,288 µV / 65,536). Alternatively, one could keep the original gain of 0.03125 µV per integer value, but in this case the 16-bit data range could only cover a physical data range from -1,024 to +1,024 µV. This would still be a reasonable physical range for EEG data, in which blink amplitudes rarely exceed 500 µV. The problem, of course, is that the acquired data can fluctuate anywhere within the 24-bit integer range, particularly with BioSemi's true DC recordings (no physical high pass filter; BioSemi uses only digital filters), and artifactual data fluctuations may span the entire recording range. The decision is then either to lose the data outside the A/D conversion range and keep a reasonable A/D data resolution, or to keep all recorded data at a low A/D data resolution. Alternatively, one can try to compromise between these two extremes. |
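The arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch; the function names are made up for this example and are not part of PolyRex or any BioSemi software.

```python
def microvolts_per_integer(physical_span_uv, bits):
    """Physical span (µV) divided by the number of integer codes available."""
    return physical_span_uv / (2 ** bits)


def physical_range_uv(gain_uv, bits):
    """Signed physical range (µV) covered at a given µV-per-integer gain."""
    lowest = -(2 ** (bits - 1))
    highest = 2 ** (bits - 1) - 1
    return lowest * gain_uv, highest * gain_uv


span_uv = 524_288  # -262,144 to +262,143 µV, as stated in the text

print(microvolts_per_integer(span_uv, 24))  # 0.03125
print(microvolts_per_integer(span_uv, 16))  # 8.0
print(physical_range_uv(0.03125, 16))       # (-1024.0, 1023.96875)
print(physical_range_uv(0.5, 16))           # (-16384.0, 16383.5)
```

Note that the positive limit is always one integer step short of the negative limit, which is why the text's "+1,024" and "+16,384" are rounded values.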
2. Sometimes the conversion seems to be fine, but when converting large data files it appears that some data are lost during the conversion? |
When the converted data are displayed (e.g., with NeuroScan's Edit program), they may look "crude" or "chopped", not "like EEG" data (cf. left panel below). This is a direct result of the 24- to 16-bit down-conversion using a low gain resolution (more µV per integer value). When the PolyRex default settings are used, a low gain resolution is frequently caused by some kind of recording artifact. In contrast, using a high gain resolution (fewer µV per integer value) during conversion would result in a more reasonable representation (cf. right panel below; both panels have the same time and µV scales). The conversion settings of PolyRex can be changed to accomplish this at the expense of losing (artifactual) data. |
![]() |
Both the specifics of the recorded data and the selected conversion settings of PolyRex will affect the gain that is used during data conversion. The gain that is used during the conversion is reported on the destination matrix form (below left), while the original recording gain is reported on the source matrix form (below right).
![]() |
![]() |
There are several ways to influence the gain selection of PolyRex. First, there are basic settings for gains and offsets, which should be chosen according to the conversion goal. Second, rereferencing the recorded data "on-the-fly" during conversion will also affect the conversion gain. Lastly, one can try to identify the source of a low gain conversion using the gain and level display, and exclude data records or flag bad channels accordingly. |
3. How do the settings for gains and offsets affect the conversion process? |
The whole purpose of PolyRex is to optimize the 24- to 16-bit conversion process, trying to keep the loss of gain resolution to a minimum. If the 'Remove recording offsets' option is checked, PolyRex will analyze the integer range of the recorded data in a first data scan, and subtract its mean, an estimate for the overall recording offset level, before integers are written to the 16-bit data file. As a result, only the actual recorded integer range has to be squeezed into the 16-bit integer range. If the width of the recorded integer range is less than the target integer range (i.e., 65,536 or less), no loss in gain resolution will occur. Leaving this option unchecked will convert the data as recorded, and will result in data saturation for integers outside the 16-bit range. If data get saturated during the conversion process, PolyRex will indicate these as occurrences of over- and underflow on the main form, and also in the log notes. Checking the option 'Adjust break offsets at epoch intersections (recording on/off)' will exploit the information about when data acquisition was turned off and on again within the same data file (BioSemi's ActiView acquisition software flags these events). Most likely, the overall EEG recording level has changed during the acquisition pause, resulting in sharp recording 'offsets' or 'jumps' within the recording. PolyRex will eliminate these 'break' offsets to further optimize data conversion. For most practical EEG or ERP purposes, the overall recording level has no informational value. Nevertheless, these data can be stored in the header of the converted NeuroScan file, or written to an external ASCII file, if this information is needed.
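The two-scan idea behind 'Remove recording offsets' can be sketched as follows. This is not PolyRex's actual code: the function name is hypothetical, and reading "its mean" as the midpoint between the observed minimum and maximum is an assumption made here so that any range narrower than the 16-bit range is guaranteed to fit.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def remove_offset_and_clip(samples, remove_offset=True):
    """Return (converted, underflows, overflows) for one channel of 24-bit integers."""
    # First scan: estimate the recording offset from the observed integer range
    # (assumed here to be the midpoint of minimum and maximum).
    offset = (min(samples) + max(samples)) // 2 if remove_offset else 0
    converted, underflows, overflows = [], 0, 0
    # Second scan: subtract the offset and saturate at the 16-bit limits.
    for value in samples:
        shifted = value - offset
        if shifted < INT16_MIN:
            shifted, underflows = INT16_MIN, underflows + 1
        elif shifted > INT16_MAX:
            shifted, overflows = INT16_MAX, overflows + 1
        converted.append(shifted)
    return converted, underflows, overflows


channel = [1_200_000, 1_200_400, 1_199_700, 1_201_000]
print(remove_offset_and_clip(channel))
# ([-350, 50, -650, 650], 0, 0)
print(remove_offset_and_clip(channel, remove_offset=False))
# ([32767, 32767, 32767, 32767], 0, 4)
```

In the second call the data sit far above the 16-bit range, so every sample pins at the maximum and is counted as an overflow.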
Unless the 'Adjust 24-bit integer range' option is checked under 'Gains', no attempt will be made to change the original integer values. The integer range slider below this option, together with the edit fields 'Restrict range to:', will set the target integer range. Although using less than the total 16-bit range of 65,535 will further limit the gain resolution, it may be advisable to avoid using the full range, as follow-up data processing (e.g., blink reduction, filtering, baseline correction) may result in data saturation (PolyRex uses 50% as the default). There are three modes to compute a new gain: 1) 'Separately for each channel' will optimize the gain resolution for each channel, but will most likely result in different gains across channels; 2) 'Across all included channels' will produce the same gain for each channel after determining the integer recording range across all channels; and 3) 'Apply a fixed gain value' uses a gain determined by the user. Whereas options 1 and 2 guarantee that the recorded data stay within the conversion target range (unless bad channels are flagged), option 3 may result in data saturation (these occurrences are reported in the log notes and indicated on the main form). [Note: Since version 1.1.3.7, PolyRex will compute and remove the median integer level from the converted data for option 3, as this will likely leave most of the 'good' recording periods within the target conversion range.] |
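The three gain modes can be illustrated with a short sketch. The function and mode names are hypothetical, and two assumptions are made that the text does not confirm: the new gain is taken to scale linearly with the ratio of recorded width to target width, and "across all included channels" is taken to mean the widest per-channel range.

```python
ORIGINAL_GAIN_UV = 0.03125   # µV per integer in the 24-bit recording
FULL_16BIT_RANGE = 65_535


def fitted_gain(width, target_fraction=0.5):
    """µV per integer needed to fit an integer range of `width` into the target range."""
    target_width = FULL_16BIT_RANGE * target_fraction
    # No fit is needed when the recorded range is already narrow enough.
    return ORIGINAL_GAIN_UV * max(1.0, width / target_width)


def new_gains(widths, mode, fixed_gain=None, target_fraction=0.5):
    """widths: dict mapping channel name to the width of its recorded integer range."""
    if mode == "per_channel":
        return {ch: fitted_gain(w, target_fraction) for ch, w in widths.items()}
    if mode == "across_channels":
        shared = fitted_gain(max(widths.values()), target_fraction)
        return {ch: shared for ch in widths}
    if mode == "fixed":
        return {ch: fixed_gain for ch in widths}
    raise ValueError(f"unknown mode: {mode}")


widths = {"Fz": 20_000, "Cz": 65_535, "T8": 8_388_480}
print(new_gains(widths, "per_channel"))
# {'Fz': 0.03125, 'Cz': 0.0625, 'T8': 8.0}
print(new_gains(widths, "across_channels"))
# {'Fz': 8.0, 'Cz': 8.0, 'T8': 8.0}
print(new_gains(widths, "fixed", fixed_gain=0.25))
# {'Fz': 0.25, 'Cz': 0.25, 'T8': 0.25}
```

With a single artifact-laden channel (T8 in this made-up example), the shared mode drags every channel down to 8 µV per integer, whereas the per-channel mode leaves the clean channels at or near the recording gain.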
4. What is the purpose of the A/D gain histogram and the integer recording level display? |
The 'Analyze' button on the main form helps to determine the best conversion settings by previewing the effects of the new gain calculation, because it forces PolyRex to analyze the integer range of the recorded data. The gain histogram provides quick feedback on how the gain will differ between channels after conversion (cf. graphic below). However, only the 'Calculate new gain' option 'Separately for each channel' under the 'Offsets and Gains' settings can produce different gains for each channel; otherwise, gains will be identical for all channels (gains may also be identical, for example, if no fit is necessary, thereby keeping the optimal recording gain). The lowest and highest gain values are listed in the upper left corner of the display. A left-mouse click on the gain histogram (or the right-mouse pop-up menu) will show the original integer recording levels for the entire recording. As these values are computed as integer means for each data record (usually 1 s) and each channel, they allow a rough evaluation of how recording levels fluctuate across the entire recording. Artifacts, drifts, or any other recording problems can be recognized and addressed accordingly, for instance, by flagging a bad channel, or by excluding artifactual recording epochs. The example below identifies a recording artifact in channel T8 (green line) between 8 and 9 min. Note that the integer values for channel T8 vary across the entire positive 24-bit integer range, whereas those of the other three EEG channels do not. A closer look at the event table for the converted events reveals no stimulus or response activity between 340 and 652 s, suggesting that EEG data were collected during an intermission. This problem could be solved by converting the file in two conversion runs, selecting only data records before (i.e., up to 340 s) and after the recording artifact (i.e., from 652 s to the end).
Alternatively, a fixed gain may be determined based on the projected gains of the other channels, which will saturate channel T8 during the period of the recording artifact. Or, a bad channel flag may be set for T8. |
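The integer level display can be approximated outside PolyRex by averaging each data record per channel and comparing how far these means wander. The sketch below uses made-up sample values and hypothetical function names; it only illustrates why a channel like T8 stands out.

```python
def record_means(samples, samples_per_record):
    """Integer mean of each complete data record for one channel."""
    last_start = len(samples) - samples_per_record
    return [
        sum(samples[i:i + samples_per_record]) // samples_per_record
        for i in range(0, last_start + 1, samples_per_record)
    ]


def level_span(means):
    """Width of the range covered by the record means of one channel."""
    return max(means) - min(means)


# Toy recording: 4 records of 2 samples each (real records hold about 1 s of data).
recording = {
    "Cz": [100, 110, 90, 100, 105, 95, 100, 100],
    "T8": [100, 110, 4_000_000, 4_000_100, 105, 95, 100, 100],
}
for channel, samples in recording.items():
    means = record_means(samples, samples_per_record=2)
    print(channel, means, level_span(means))
# Cz [105, 95, 100, 100] 10
# T8 [105, 4000050, 100, 100] 3999950
```

A channel whose span is orders of magnitude larger than the others is a candidate for a bad channel flag, and the index of the deviating record points to the time period that could be excluded.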
5. How does re-referencing the recorded data affect the gain resolution? |
By default, BioSemi stores all acquired data without a physical recording reference (see BioSemi's documentation for using an active reference with a passive electrode). While data may be converted in this original, "reference-free" state, it is more useful to convert the data to a physical reference using one (or more) of the electrodes included in the recording montage. This can be done "on-the-fly" during the conversion process by checking the 'Rereference' option under 'Channels and Reference' settings, and highlighting one (or more) channels in the reference montage box (use the Shift-key to select more than one channel). All data transformations are performed before an appropriate gain is determined, and therefore affect the observed recording integer range that has to fit into the target integer range. |
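Why rereferencing changes the gain can be seen in a small example: a drift that is common to all electrodes disappears once the reference is subtracted, so the integer range that has to be fitted becomes much narrower. The sketch below uses made-up values and hypothetical function names, and it assumes that several reference channels are combined by averaging, which the text does not state.

```python
def rereference(data, reference_channels):
    """Subtract the sample-wise mean of the reference channels from every channel."""
    n_samples = len(next(iter(data.values())))
    reference = [
        sum(data[ch][i] for ch in reference_channels) / len(reference_channels)
        for i in range(n_samples)
    ]
    return {
        ch: [value - ref for value, ref in zip(values, reference)]
        for ch, values in data.items()
    }


def width(values):
    """Width of the integer range covered by one channel."""
    return max(values) - min(values)


# All three channels share a large common drift of about 50,000 integers per sample.
data = {
    "Cz": [1010, 50_990, 101_020, 150_980],
    "M1": [900, 50_950, 100_900, 150_950],
    "M2": [1100, 51_050, 101_100, 151_050],
}
rereferenced = rereference(data, ["M1", "M2"])
print(width(data["Cz"]), width(rereferenced["Cz"]))
# 149970 40.0
```

Before rereferencing, Cz would not fit into 16 bits without a coarser gain; afterwards its range is only 40 integers wide.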
7. How can I exclude artifactual data records from the conversion? |
Sometimes, the original recording includes data recorded during unattended time intervals (e.g., at the beginning or end of a recording block, during session breaks, etc.), which frequently contain all kinds of EEG recording artifacts (movements, current induction, etc.; see an artifact example above). Because of the wide acquisition range of the analog signal, these artifacts will not easily saturate (as they do with a restricted 16-bit acquisition range), but rather will be recorded as they are (i.e., huge artifacts). By default, PolyRex will try to accommodate the 24-bit recording range and fit the observed signal into a 16-bit target range, which will result in an undesirably low gain resolution. With the aid of the A/D gain histogram and the integer level display, artifact time periods and the contributing channels can be identified. The main window allows the conversion of only a subrange of the original data records. The example to the right shows that the original file holds 1304 data records with a duration of 1 s/record. Checking the 'Range' option, and specifying the subrange of data records (e.g., from 1 to 527), will restrict the conversion (and all gain calculations) to this subrange (i.e., from the beginning of the recording up to recording time 8 min 47 s = 527 s). [Note that this option must be unchecked or adjusted for a different data file.] |
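Translating between data record numbers and recording time is simple arithmetic, sketched here with hypothetical helper names. Records are assumed to be numbered from 1, as in the example above.

```python
def record_range_to_time(first_record, last_record, seconds_per_record=1.0):
    """Start and end time (s) covered by an inclusive, 1-based range of data records."""
    start_s = (first_record - 1) * seconds_per_record
    end_s = last_record * seconds_per_record
    return start_s, end_s


def format_min_sec(seconds):
    """Format a duration in seconds as 'M min S s'."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes} min {secs} s"


start_s, end_s = record_range_to_time(1, 527)
print(start_s, end_s)         # 0.0 527.0
print(format_min_sec(end_s))  # 8 min 47 s
```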
8. Is it preferable to use separate gain calculations for each channel, rather than using a fixed gain or a single gain calculation across all channels? |
There is no easy answer to this question: it depends on the data and the intended use. Calculating the same gain across all channels will limit the gain resolution for all channels to the gain of the "weakest" contributor; however, it has the advantage of having the same gain for all channels, which is most comparable to the acquisition scenario (usually, all EEG amplifiers are set to the same gain). Obviously, if the "weakest" contributing channel provides a very bad gain, all other (perfectly good) channels will have the same bad gain. Using a well-chosen fixed gain value may provide a reasonable compromise, as this will (hopefully) saturate the weakest channel during artifactual recording periods (see Offset and Gain). A low gain does have a significant impact on the raw data, and therefore on all EEG measures generated directly from the raw data (e.g., spectral measures). The impact on secondary EEG measures, such as ERPs, which are generated by averaging (many) EEG epochs, will depend on the overall signal-to-noise ratio (SNR). With a sufficiently high SNR, stemming from either a large ERP signal (i.e., components), low background noise, or both, it is very likely that small differences in acquisition gain are negligible or non-existent in the averaged ERP waveform. |
9. What is a reasonable gain? |
As a rule of thumb for typical EEG or ERP studies, one would like to avoid a gain coarser than 1 µV/integer value. Any gain equal to or better than 0.5 µV/integer value would be comparable to the gain typically accomplished with a conventional 16-bit EEG acquisition system. Ultimately, the gain selection must be based on the objectives or priorities of the study. Ideally, PolyRex will convert the 24-bit data without any loss in the original recording gain of 0.03125 µV/integer value. |
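The thresholds named above can be turned into a small helper for screening the gains reported after a conversion. The function name and the wording of the labels are made up for this sketch; only the three threshold values come from the text.

```python
def rate_gain(gain_uv_per_integer):
    """Classify a conversion gain (µV per integer value) by the rule of thumb above."""
    if gain_uv_per_integer <= 0.03125:
        return "original recording gain preserved"
    if gain_uv_per_integer <= 0.5:
        return "comparable to a conventional 16-bit system"
    if gain_uv_per_integer <= 1.0:
        return "within the rule of thumb"
    return "coarser than the rule of thumb"


for gain in (0.03125, 0.25, 1.0, 8.0):
    print(gain, "->", rate_gain(gain))
```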
13. Are there any procedures for pre-filtering the data prior to conversion? |
As a simple data conversion program, PolyRex is not intended to perform other typical data post-processing routines, although some functions (re-referencing, adding EOG channels) are provided. In fact, the conversion should allow the use of NeuroScan's EDIT program for routine data processing, including filtering. While applying a high pass filter to the data before conversion could help improve the gain, it will also remove the DC recording property of the BioSemi acquisition system, which may or may not interfere with the research objective. Pre-conversion filtering could probably be accomplished with Matlab (there are links to free Matlab code from third-party developers to read BioSemi data files on BioSemi's download page), but the filtered data would obviously need to be saved in *.BDF format for subsequent use with PolyRex. |