Sound is one of the most important channels through which people exchange information, and adding sound to an application can make it considerably more approachable; in scientific research, moreover, sound signal processing is an important field in its own right. Visual C++, as a powerful development tool, is a natural choice for sound processing. Yet in the Visual C++ programming material currently available, whether thick reference books or computer magazines, the handling of sound files is usually touched on only in passing, and many programming enthusiasts feel their understanding of the topic is incomplete. Drawing on experience accumulated in my own study and development work, this example explores sound file processing with fellow programmers. It cannot cover every aspect of the subject, but I hope it will serve as a guide for readers who have just entered the field and help them move on quickly to its deeper levels.
Current computer systems offer two ways to work with sound files. One is to use ready-made software: tools such as the Windows Sound Recorder, Sound Forge, and Cool Edit can record, edit, and play back sound signals, but their functionality is fixed. To process sound data more flexibly and more thoroughly, you have to turn to the second approach, that is, using the multimedia services Microsoft provides and writing your own programs under Windows to implement the specific processing you need. The rest of this article introduces the sound file format and the methods for programming sound files with Visual C++ under Windows.
I. Implementation Method
1. The RIFF File Structure and the WAVE File Format
Windows supports two audio file formats based on RIFF (Resource Interchange File Format): RMID files for MIDI, and the waveform audio format, WAVE. The latter is the most common digitized sound file format in the computer field; it is the waveform file format (Waveform Audio) that Microsoft defined specifically for Windows, and because its extension is "*.wav", files of this kind are also called WAVE files. To keep this article focused, the sound files discussed below are WAVE files. Two kinds of WAVE file are most common, corresponding to mono (11.025 kHz sampling rate, 8-bit samples) and stereo (44.1 kHz sampling rate, 16-bit samples). The sampling rate here is the number of samples taken per unit time while the sound signal undergoes analog-to-digital conversion; the sample value records the amplitude of the analog sound signal in each sampling period. In a mono 8-bit file each sample is a single unsigned byte (00H-FFH), while in a 16-bit stereo file each sample frame holds two 16-bit integers, one for the left channel and one for the right (see Figure 4 below). The data chunk of a WAVE file contains the samples in pulse code modulation (PCM) format. Before starting to program, let us first look at the RIFF and WAVE file formats.
A RIFF file can be viewed as a tree whose basic unit is the "chunk". Each chunk consists of an identifier, a data size, and the data itself, as shown in Figure 1:
Chunk identifier (4 bytes)
Data size (4 bytes)
Data

Figure 1. Structure of a chunk
RIFF/LIST identifier (4 bytes)
Data-1 size (4 bytes)
Data-1: form/list type (4 bytes), followed by the data itself

Figure 2. Structure of a RIFF/LIST chunk
"RIFF" identifier
Data size
Form type "WAVE"
"fmt " chunk identifier
sizeof(PCMWAVEFORMAT)
PCMWAVEFORMAT structure
"data" chunk identifier
Sound data size
Sound data

Figure 3. Structure of a WAVE file
The PCMWAVEFORMAT structure is defined as follows:

typedef struct {
    WAVEFORMAT wf;        // waveform format
    WORD wBitsPerSample;  // sample size of the WAVE file, in bits
} PCMWAVEFORMAT;

The WAVEFORMAT structure it contains is defined as:

typedef struct {
    WORD  wFormatTag;      // encoding format, e.g. WAVE_FORMAT_PCM, WAVE_FORMAT_ADPCM
    WORD  nChannels;       // number of channels: 1 for mono, 2 for stereo
    DWORD nSamplesPerSec;  // sampling rate
    DWORD nAvgBytesPerSec; // average number of data bytes per second
    WORD  nBlockAlign;     // block alignment, in bytes
} WAVEFORMAT;
The sound data follows the layouts below. In a 16-bit mono file the samples are stored one after another:

Sample 1 (low byte, high byte) | Sample 2 (low byte, high byte) | ...

In a 16-bit stereo file each sample frame interleaves the two channels:

Sample 1: left channel (low byte, high byte), right channel (low byte, high byte) | ...

Figure 4. WAVE file data sample layout
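This interleaving can be expressed directly in code. The following minimal sketch (the StereoFrame16 type and GetFrame helper are my own illustrative names, not part of any Windows API) shows how the n-th sample frame of a block of raw 16-bit stereo PCM data is located:

#include <windows.h>
#include <string.h>

struct StereoFrame16 {   // one sample frame of a 16-bit stereo stream
    short left;          // 16-bit left-channel sample
    short right;         // 16-bit right-channel sample
};

// Return the n-th sample frame of a block of raw 16-bit stereo PCM data.
StereoFrame16 GetFrame(const BYTE *pData, DWORD n)
{
    StereoFrame16 frame;
    // each frame occupies 4 bytes: left low, left high, right low, right high
    memcpy(&frame, pData + n * sizeof(StereoFrame16), sizeof(frame));
    return frame;
}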
2. Reading the Sound Data of a Sound File
Operating on a sound file means opening the WAVE file, extracting the sound data from it, carrying out whatever mathematical processing the chosen algorithm requires, and then storing the result back into a WAVE-format file. The reading can be done with the CFile class, or with another method: the multimedia I/O functions provided by Windows (their names all begin with mmio). This section shows how to use these functions to obtain the sound data of a file; how the data is then processed depends on the algorithm you choose for your purpose. The workflow for a WAVE file is as follows: 1) call mmioOpen to open the WAVE file and obtain a file handle of type HMMIO; 2) following the structure of the WAVE file, call mmioRead, mmioWrite, and mmioSeek to read, write, and seek within the file; 3) call mmioClose to close the WAVE file.
The following function reads the data of a two-channel stereo WAVE file according to the format just described. To use this code, link Winmm.lib into the project and include the header "Mmsystem.h".

BYTE * GetData(CString *pString)
// Reads the sound data of a WAVE file; pString points to the name of the file to open.
{
    if (pString == NULL)
        return NULL;
    HMMIO file1; // HMMIO file handle
    file1 = mmioOpen((LPSTR)(LPCSTR)*pString, NULL, MMIO_READWRITE);
    // open the given WAVE file in read/write mode
    if (file1 == NULL)
    {
        AfxMessageBox("Failed to open the WAVE file!");
        return NULL;
    }
    char style[4]; // four bytes holding the file's form type
    mmioSeek(file1, 8, SEEK_SET); // seek to the form-type field of the WAVE file
    mmioRead(file1, style, 4);
    if (style[0] != 'W' || style[1] != 'A' || style[2] != 'V' || style[3] != 'E')
    // check whether this really is a "WAVE" file
    {
        AfxMessageBox("This file is not in the WAVE format!");
        mmioClose(file1, 0);
        return NULL;
    }
    PCMWAVEFORMAT format; // PCMWAVEFORMAT object used to examine the WAVE format
    mmioSeek(file1, 20, SEEK_SET);
    // seek to the PCMWAVEFORMAT structure stored in the file
    mmioRead(file1, (char*)&format, sizeof(PCMWAVEFORMAT)); // read the structure
    if (format.wf.nChannels != 2) // is this stereo sound?
    {
        AfxMessageBox("This sound file is not a two-channel stereo file!");
        mmioClose(file1, 0);
        return NULL;
    }
    mmioSeek(file1, 24 + sizeof(PCMWAVEFORMAT), SEEK_SET);
    // seek to the size field of the sound data
    long size;
    mmioRead(file1, (char*)&size, 4);
    BYTE *pData;
    pData = (BYTE*)new char[size]; // allocate a buffer of the reported size
    mmioSeek(file1, 28 + sizeof(PCMWAVEFORMAT), SEEK_SET); // seek to the start of the sound data
    mmioRead(file1, (char*)pData, size); // read the sound data
    mmioClose(file1, 0); // close the WAVE file
    return pData;
}
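A minimal calling sketch might look as follows (the file name is only an example; note that GetData does not report the size of the returned block, so in a real program you would normally extend it to return the size as well):

CString name("d:\\sound.wav");  // hypothetical path to a stereo WAVE file
BYTE *pData = GetData(&name);
if (pData != NULL)
{
    // ... run the desired processing algorithm on the samples here ...
    delete [] pData;            // the buffer came from new[], so release it with delete[]
}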
3. Operating on Sound Files with MCI
The most basic operation on a WAVE sound file is playing the sound data it contains. The Windows API function BOOL sndPlaySound(LPCSTR lpszSound, UINT fuSound) can play small WAV files; the parameter lpszSound is the sound file to play, and fuSound holds the flags used for playback. For example, to play Sound.wav asynchronously you only need to call sndPlaySound("c:\\windows\\Sound.wav", SND_ASYNC), so the function is very simple to use. However, once a WAVE file grows beyond roughly 100 KB, the system can no longer read all the sound data into memory at once and sndPlaySound cannot play it. One way around this limitation is to operate on the sound file through MCI instead. Before using MCI, add winmm.lib under Project->Settings->Link->Object/library modules in your project settings, and include the "mmsystem.h" header.
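For example, a pair of calls like the following plays a small WAVE file asynchronously and then stops it (the path is illustrative):

#include <windows.h>
#include <mmsystem.h>   // link with winmm.lib

void PlaySmallWave()
{
    // start asynchronous playback; the call returns immediately
    sndPlaySound("c:\\windows\\Sound.wav", SND_ASYNC);
    // ... later: passing NULL stops any sound started by sndPlaySound
    sndPlaySound(NULL, 0);
}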
The Microsoft API provides the MCI (Media Control Interface) functions mciSendCommand() and mciSendString() for playing WAVE files; only the use of mciSendCommand() is described here.
Prototype: DWORD mciSendCommand(UINT wDeviceID, UINT wMessage, DWORD dwParam1, DWORD dwParam2);
Parameters:
wDeviceID: the ID of the device that receives the command message;
wMessage: the MCI command message;
dwParam1: the flags for the command;
dwParam2: a pointer to the parameter block used by the command.
Return value: zero if the call succeeds; otherwise the low word of the returned DWORD holds the error information.
To play a sound file with MCI, the audio device must first be opened. Define an MCI_OPEN_PARMS variable OpenParms and set its members accordingly:

OpenParms.lpstrDeviceType = (LPCSTR) MCI_DEVTYPE_WAVEFORM_AUDIO; // waveform audio device type
OpenParms.lpstrElementName = (LPCSTR) Filename; // name of the sound file to open
OpenParms.wDeviceID = 0; // ID of the audio device to open

After the call mciSendCommand(NULL, MCI_OPEN, MCI_WAIT | MCI_OPEN_TYPE | MCI_OPEN_TYPE_ID | MCI_OPEN_ELEMENT, (DWORD)(LPVOID)&OpenParms) sends the MCI_OPEN command, the wDeviceID member of the returned OpenParms identifies which device was opened. When the audio device needs to be closed, simply call mciSendCommand(m_wDeviceID, MCI_CLOSE, NULL, NULL).
To play the WAVE file, define an MCI_PLAY_PARMS variable PlayParms and set PlayParms.dwFrom = 0, which specifies the position (time) at which playback starts. Once this is set, the call mciSendCommand(m_wDeviceID, MCI_PLAY, MCI_FROM, (DWORD)(LPVOID)&PlayParms) plays the WAVE sound file.
Likewise, calling mciSendCommand(m_wDeviceID, MCI_PAUSE, 0, (DWORD)(LPVOID)&PlayParms) pauses playback, and calling mciSendCommand(m_wDeviceID, MCI_STOP, NULL, NULL) stops it. As these examples show, the different operations are selected through different values of the "wMessage" parameter. Combining different messages with different dwParam1 and dwParam2 values also makes it possible to seek within a file; for example, the following call jumps to the end of the WAVE file: mciSendCommand(m_wDeviceID, MCI_SEEK, MCI_SEEK_TO_END, NULL).
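Put together, a minimal sketch of these control calls might look like this (m_wDeviceID is assumed to hold the device ID returned by MCI_OPEN, as above):

void PauseWave(MCIDEVICEID m_wDeviceID)
{
    MCI_GENERIC_PARMS GenParms = { 0 };
    // suspend playback; MCI_RESUME would continue from the same position
    mciSendCommand(m_wDeviceID, MCI_PAUSE, 0, (DWORD)(LPVOID)&GenParms);
}

void StopWave(MCIDEVICEID m_wDeviceID)
{
    // stop playback and rewind, so the next MCI_PLAY starts at the beginning
    mciSendCommand(m_wDeviceID, MCI_STOP, NULL, NULL);
    mciSendCommand(m_wDeviceID, MCI_SEEK, MCI_SEEK_TO_START, NULL);
}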
4. Operating on WAVE Files with DirectSound
MCI is easy to call and powerful enough for the basic needs of sound file processing, but it has a drawback: it can play only one WAVE file at a time. In practice you sometimes need to play two or more WAVE files simultaneously to produce a mixing effect, and for that you need DirectSound from Microsoft's DirectX technology. DirectSound operates the underlying sound card directly and can play eight or more WAV files at the same time.
Using DirectSound involves the following steps (the code in Section III below follows them): 1. create and initialize the DirectSound object; 2. set the application's cooperative (priority) level for the sound device, normally DSSCL_NORMAL; 3. read the WAV file into memory and locate the format chunk, the data chunk, and the data length; 4. create a sound buffer; 5. load the sound data into the buffer; 6. play and stop.
II. Programming Steps
1. Start Visual C++ 6.0 and generate a single-document (SDI) application named "playsound";
2. Add "MCI Play" and "PlaySound" items to the program's main menu, and use ClassWizard to add the corresponding message handlers, which will process the sound files with the two different methods;
3. Add the "dsound.lib, dxguid.lib, winmm.lib" libraries in the program's "Link" settings, include "mmsystem.h" in the program's view class, and place the sound files to be played, "chimes.wav" and "sound.wav", in the program's Debug directory;
4. Add the code, then compile and run the program.
III. Program Code
////////////////////////////////////////////////////
void CPlaysoundView::OnMciplay() // plays a WAVE sound file through MCI
{
    // TODO: Add your command handler code here
    MCI_OPEN_PARMS mciOpenParms;
    MCI_PLAY_PARMS PlayParms;
    mciOpenParms.dwCallback = 0;
    mciOpenParms.lpstrElementName = "d:\\chimes.wav"; // file to play
    mciOpenParms.wDeviceID = 0;
    mciOpenParms.lpstrDeviceType = "waveaudio";
    mciOpenParms.lpstrAlias = NULL;
    PlayParms.dwCallback = 0;
    PlayParms.dwTo = 0;
    PlayParms.dwFrom = 0;
    mciSendCommand(NULL, MCI_OPEN, MCI_OPEN_TYPE | MCI_OPEN_ELEMENT,
        (DWORD)(LPVOID)&mciOpenParms); // open the audio device
    mciSendCommand(mciOpenParms.wDeviceID, MCI_PLAY, MCI_WAIT,
        (DWORD)(LPVOID)&PlayParms); // play the WAVE sound file
    mciSendCommand(mciOpenParms.wDeviceID, MCI_CLOSE, NULL, NULL); // close the audio device
}

//////////////////////////////////////////////////////////////////////////////
/* The following handler plays a WAVE sound file with DirectSound (note that the
project settings must include "dsound.lib, dxguid.lib"); the code and comments follow: */
void CPlaysoundView::OnPlaySound()
{
    // TODO: Add your command handler code here
    LPVOID lpPtr1; // first pointer into the sound buffer
    LPVOID lpPtr2; // second pointer into the sound buffer
    HRESULT hResult;
    DWORD dwLen1, dwLen2; // lengths of the two locked regions
    LPVOID m_pMemory; // memory holding the whole WAVE file
    LPWAVEFORMATEX m_pFormat; // pointer to the format chunk
    LPVOID m_pData; // pointer to the sound data chunk
    DWORD m_dwSize; // length of the sound data in the WAVE file
    CFile File; // CFile object
    DWORD dwSize; // length of the WAV file
    // open sound.wav
    if (!File.Open("d:\\sound.wav", CFile::modeRead | CFile::shareDenyNone))
        return;
    dwSize = File.Seek(0, CFile::end); // get the length of the WAVE file
    File.Seek(0, CFile::begin); // return to the beginning of the file
    // allocate memory of type LPVOID to hold the data of the WAVE file
    m_pMemory = GlobalAlloc(GMEM_FIXED, dwSize);
    if (File.ReadHuge(m_pMemory, dwSize) != dwSize) // read the file data
    {
        File.Close();
        return;
    }
    File.Close();
    LPDWORD pdw, pdwEnd;
    DWORD dwRiff, dwType, dwLength;
    m_pFormat = NULL; // format chunk pointer, not found yet
    m_pData = NULL;   // data chunk pointer (LPBYTE), not found yet
    m_dwSize = 0;     // data length (DWORD), not found yet
    pdw = (DWORD *)m_pMemory;
    dwRiff = *pdw++;
    dwLength = *pdw++;
    dwType = *pdw++;
    if (dwRiff != mmioFOURCC('R', 'I', 'F', 'F'))
        return; // the file header is not "RIFF"
    if (dwType != mmioFOURCC('W', 'A', 'V', 'E'))
        return; // the file format is not "WAVE"
    // look for the format chunk, the data chunk position, and the data length
    pdwEnd = (DWORD *)((BYTE *)m_pMemory + dwLength - 4);
    bool m_bend = false;
    while ((pdw < pdwEnd) && (!m_bend)) // loop until the end of the file or the sound data is found
    {
        dwType = *pdw++;
        dwLength = *pdw++;
        switch (dwType)
        {
        case mmioFOURCC('f', 'm', 't', ' '): // the "fmt " chunk
            if (!m_pFormat) // remember the LPWAVEFORMATEX data
            {
                if (dwLength < sizeof(WAVEFORMAT))
                    return;
                m_pFormat = (LPWAVEFORMATEX)pdw;
            }
            break;
        case mmioFOURCC('d', 'a', 't', 'a'): // the "data" chunk
            if (!m_pData || !m_dwSize)
            {
                m_pData = (LPBYTE)pdw; // pointer to the sound data chunk
                m_dwSize = dwLength; // length of the sound data chunk
                if (m_pFormat)
                    m_bend = true;
            }
            break;
        }
        pdw = (DWORD *)((BYTE *)pdw + ((dwLength + 1) & ~1)); // advance pdw to the next chunk
    }
    DSBUFFERDESC BufferDesc; // DSBUFFERDESC structure describing the buffer
    memset(&BufferDesc, 0, sizeof(BufferDesc));
    BufferDesc.lpwfxFormat = (LPWAVEFORMATEX)m_pFormat;
    BufferDesc.dwSize = sizeof(DSBUFFERDESC);
    BufferDesc.dwBufferBytes = m_dwSize;
    BufferDesc.dwFlags = 0;
    HRESULT hRes;
    LPDIRECTSOUND m_lpDirectSound;
    hRes = ::DirectSoundCreate(0, &m_lpDirectSound, 0); // create the DirectSound object
    if (hRes != DS_OK)
        return;
    m_lpDirectSound->SetCooperativeLevel(this->GetSafeHwnd(), DSSCL_NORMAL);
    // set the cooperative level of the sound device to "NORMAL"
    // create the sound data buffer
    LPDIRECTSOUNDBUFFER m_pDSoundBuffer;
    if (m_lpDirectSound->CreateSoundBuffer(&BufferDesc, &m_pDSoundBuffer, 0) == DS_OK)
    {
        // Load the sound data. Two pointers, lpPtr1 and lpPtr2, address the
        // DirectSoundBuffer region; this is designed for handling large WAVE
        // files. dwLen1 and dwLen2 are the lengths of the regions they cover.
        hResult = m_pDSoundBuffer->Lock(0, m_dwSize, &lpPtr1, &dwLen1, &lpPtr2, &dwLen2, 0);
        if (hResult == DS_OK)
        {
            memcpy(lpPtr1, m_pData, dwLen1);
            if (dwLen2 > 0)
            {
                BYTE *m_pData1 = (BYTE *)m_pData + dwLen1;
                m_pData = (void *)m_pData1;
                memcpy(lpPtr2, m_pData, dwLen2);
            }
            m_pDSoundBuffer->Unlock(lpPtr1, dwLen1, lpPtr2, dwLen2);
        }
        DWORD dwFlags = 0;
        m_pDSoundBuffer->Play(0, 0, dwFlags); // play the WAVE sound data
    }
}
Music is a series of notes that are started and stopped at different times and with different amplitudes. A great many instructions are used to play music, but they all operate in essentially the same way, each working with the various notes. Composing on a computer really means storing many groups of notes, which the audio hardware plays back on demand.
The MIDI format (file extension .MID) is the standard format for storing digital music.
DirectMusic music segments use the .SGT file extension. Related files include band files (.BND), which contain instrument information; chordmap files (.CDM), which contain chord instructions that modify the music during playback; style files (.STY), which contain playback style information; and template files (.TPL), which contain templates for creating music segments.
MIDI is a very powerful music format. Its only drawback is that the quality of the music depends on the performance of the synthesizer, because MIDI records nothing but the notes; the playback quality is determined by the software and hardware doing the playing. An MP3 file (extension .MP3) is a format similar to a wave file, but the biggest difference between MP3 and WAV files is that MP3 compresses the sound to a minimum size while keeping the quality essentially unchanged. MP3 files can be played with the DirectShow component, a very powerful multimedia component that can play almost any media file, audio or video; some sound files can only be played with DirectShow.
Direct Audio is a composite component made up of the DirectSound and DirectMusic components.
DirectMusic was greatly enhanced in DirectX 8, while DirectSound remained essentially unchanged. DirectSound is the main component for digital sound playback. DirectMusic handles all of the song formats, including MIDI, DirectMusic native files, and wave files; after processing them it feeds the result into DirectSound for further processing, which means digitized instruments can be used when playing back MIDI.
Using DirectSound
To use DirectSound you create a COM object that communicates with the sound card, and with it you create a number of independent sound data buffers (called secondary sound buffers) to hold the audio data. The data in these buffers is mixed in the main mixing buffer (called the primary sound buffer) and can then be played in any format you specify. A playback format is described by its sampling rate, channel count, and sample precision; the possible sampling rates are 8000 Hz, 11025 Hz, 22050 Hz, and 44100 Hz (CD quality).
For the channel count there are two choices: single-channel mono and two-channel stereo. The sample precision is likewise limited to two choices: low-quality 8-bit sound and high-fidelity 16-bit sound. Unless you change it, the DirectSound primary buffer defaults to a 22050 Hz sampling rate, 8-bit precision, stereo. DirectSound lets you adjust the playback speed of a sound (which also changes its pitch), adjust the volume, loop the sound, and even play it in a virtual 3D environment to simulate sound actually surrounding the listener.
What you must do is keep the buffer filled with sound data. If the sound data is too large, you have to stream it: load a small block of the data, and when that block has finished playing, load the next small block into the buffer, continuing until the whole sound has been processed. Streaming audio is implemented by adjusting the play position within the buffer and notifying the application to refresh the audio data when playback reaches a given point; this notification mechanism is called "notification". There is no limit on the number of buffers that can play at the same time, but you should still keep the number down, because every additional buffer costs a significant amount of memory and CPU time.
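As a sketch of how such a notification could be set up with the IDirectSoundNotify8 interface described below (the buffer must have been created with the DSBCAPS_CTRLPOSITIONNOTIFY flag; pBuffer, hEvent, and dwHalfSize are illustrative names):

// pBuffer: an IDirectSoundBuffer8* created with DSBCAPS_CTRLPOSITIONNOTIFY
HANDLE hEvent = CreateEvent(NULL, FALSE, FALSE, NULL);  // auto-reset event

LPDIRECTSOUNDNOTIFY8 pNotify = NULL;
if (SUCCEEDED(pBuffer->QueryInterface(IID_IDirectSoundNotify8, (LPVOID *)&pNotify)))
{
    DSBPOSITIONNOTIFY pos;
    pos.dwOffset = dwHalfSize;       // signal when playback reaches the buffer's midpoint
    pos.hEventNotify = hEvent;
    pNotify->SetNotificationPositions(1, &pos);  // must be called while the buffer is stopped
    pNotify->Release();
}
// a streaming thread can now WaitForSingleObject(hEvent, INFINITE) and refill
// the half of the buffer that has just finished playing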
To use DirectSound and DirectMusic in a project, add the headers dsound.h and dmusici.h and link DSound.lib into the included libraries; adding the DXGuid.lib library makes DirectSound easier to use.
The main DirectSound COM interfaces are:
IDirectSound8: the DirectSound device interface.
IDirectSoundBuffer8: the interface for primary and secondary buffers; it holds the data and controls playback.
IDirectSoundNotify8: the notification object, which informs the application that a specified play position has been reached.
(Figure: the relationships among these objects.)
The DirectSoundCreate8 function creates and initializes an object that supports the IDirectSound8 interface.
HRESULT DirectSoundCreate8(
LPCGUID lpcGuidDevice,
LPDIRECTSOUND8 * ppDS8,
LPUNKNOWN pUnkOuter
);
Value | Description |
---|---|
DSDEVID_DefaultPlayback | System-wide default audio playback device. Equivalent to NULL. |
DSDEVID_DefaultVoicePlayback | Default voice playback device. |
If the function succeeds, it returns DS_OK. If it fails, the return value may be one of the following.
Return Code |
---|
DSERR_ALLOCATED |
DSERR_INVALIDPARAM |
DSERR_NOAGGREGATION |
DSERR_NODRIVER |
DSERR_OUTOFMEMORY |
The application must call the IDirectSound8::SetCooperativeLevel method immediately after creating a device object.
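A minimal creation sketch, assuming the application window handle is available (variable and function names are mine, not from the source):

#include <dsound.h>              // link with dsound.lib and dxguid.lib

LPDIRECTSOUND8 g_pDS = NULL;     // the DirectSound device object

HRESULT InitDirectSound(HWND hWnd)
{
    // create the device object on the default playback device (NULL GUID)
    HRESULT hr = DirectSoundCreate8(NULL, &g_pDS, NULL);
    if (FAILED(hr))
        return hr;
    // the cooperative level must be set immediately after creation;
    // DSSCL_PRIORITY also allows the primary buffer format to be changed later
    return g_pDS->SetCooperativeLevel(hWnd, DSSCL_PRIORITY);
}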
Creating the Primary Sound Buffer
The CreateSoundBuffer method creates a sound buffer object to manage audio samples.
HRESULT CreateSoundBuffer(
LPCDSBUFFERDESC pcDSBufferDesc,
LPDIRECTSOUNDBUFFER * ppDSBuffer,
LPUNKNOWN pUnkOuter
);
If the method succeeds, the return value is DS_OK, or DS_NO_VIRTUALIZATION if a requested 3D algorithm was not available and stereo panning was substituted. See the description of the guid3DAlgorithm member of DSBUFFERDESC. If the method fails, the return value may be one of the error values shown in the following table.
Return code |
---|
DSERR_ALLOCATED |
DSERR_BADFORMAT |
DSERR_BUFFERTOOSMALL |
DSERR_CONTROLUNAVAIL |
DSERR_DS8_REQUIRED |
DSERR_INVALIDCALL |
DSERR_INVALIDPARAM |
DSERR_NOAGGREGATION |
DSERR_OUTOFMEMORY |
DSERR_UNINITIALIZED |
DSERR_UNSUPPORTED |
DirectSound does not initialize the contents of the buffer, and the application cannot assume that it contains silence.
If an attempt is made to create a buffer with the DSBCAPS_LOCHARDWARE flag on a system where hardware acceleration is not available, the method fails with either DSERR_CONTROLUNAVAIL or DSERR_INVALIDCALL, depending on the operating system.
The DSBUFFERDESC structure describes the characteristics of a new buffer object. It is used by the IDirectSound8::CreateSoundBuffer method and by the DirectSoundFullDuplexCreate8 function.
An earlier version of this structure, DSBUFFERDESC1, is maintained in Dsound.h for compatibility with DirectX 7 and earlier.
typedef struct DSBUFFERDESC {
DWORD dwSize;
DWORD dwFlags;
DWORD dwBufferBytes;
DWORD dwReserved;
LPWAVEFORMATEX lpwfxFormat;
GUID guid3DAlgorithm;
} DSBUFFERDESC;
Value | Description | Availability |
---|---|---|
DS3DALG_DEFAULT | DirectSound uses the default algorithm. In most cases this is DS3DALG_NO_VIRTUALIZATION. On WDM drivers, if the user has selected a surround sound speaker configuration in Control Panel, the sound is panned among the available directional speakers. | Applies to software mixing only. Available on WDM or Vxd Drivers. |
DS3DALG_NO_VIRTUALIZATION | 3D output is mapped onto normal left and right stereo panning. At 90 degrees to the left, the sound is coming out of only the left speaker; at 90 degrees to the right, sound is coming out of only the right speaker. The vertical axis is ignored except for scaling of volume due to distance. Doppler shift and volume scaling are still applied, but the 3D filtering is not performed on this buffer. This is the most efficient software implementation, but provides no virtual 3D audio effect. When the DS3DALG_NO_VIRTUALIZATION algorithm is specified, HRTF processing will not be done. Because DS3DALG_NO_VIRTUALIZATION uses only normal stereo panning, a buffer created with this algorithm may be accelerated by a 2D hardware voice if no free 3D hardware voices are available. | Applies to software mixing only. Available on WDM or Vxd Drivers. |
DS3DALG_HRTF_FULL | The 3D API is processed with the high quality 3D audio algorithm. This algorithm gives the highest quality 3D audio effect, but uses more CPU cycles. See Remarks. | Applies to software mixing only. Available on Microsoft Windows 98 Second Edition and later operating systems when using WDM drivers. |
DS3DALG_HRTF_LIGHT | The 3D API is processed with the efficient 3D audio algorithm. This algorithm gives a good 3D audio effect, but uses fewer CPU cycles than DS3DALG_HRTF_FULL. | Applies to software mixing only. Available on Windows 98 Second Edition and later operating systems when using WDM drivers. |
Value | Description |
---|---|
DSBCAPS_CTRL3D | The buffer has 3D control capability. |
DSBCAPS_CTRLFREQUENCY | The buffer has frequency control capability. |
DSBCAPS_CTRLFX | The buffer supports effects processing. |
DSBCAPS_CTRLPAN | The buffer has pan control capability. |
DSBCAPS_CTRLVOLUME | The buffer has volume control capability. |
DSBCAPS_CTRLPOSITIONNOTIFY | The buffer has position notification capability. See the Remarks for DSCBUFFERDESC. |
DSBCAPS_GETCURRENTPOSITION2 | The buffer uses the new behavior of the play cursor when IDirectSoundBuffer8::GetCurrentPosition is called. In the first version of DirectSound, the play cursor was significantly ahead of the actual playing sound on emulated sound cards; it was directly behind the write cursor. Now, if the DSBCAPS_GETCURRENTPOSITION2 flag is specified, the application can get a more accurate play cursor. If this flag is not specified, the old behavior is preserved for compatibility. This flag affects only emulated devices; if a DirectSound driver is present, the play cursor is accurate for DirectSound in all versions of DirectX. |
DSBCAPS_GLOBALFOCUS | The buffer is a global sound buffer. With this flag set, an application using DirectSound can continue to play its buffers if the user switches focus to another application, even if the new application uses DirectSound. The one exception is if you switch focus to a DirectSound application that uses the DSSCL_WRITEPRIMARY flag for its cooperative level. In this case, the global sounds from other applications will not be audible. |
DSBCAPS_LOCDEFER | The buffer can be assigned to a hardware or software resource at play time, or when IDirectSoundBuffer8::AcquireResources is called. |
DSBCAPS_LOCHARDWARE | The buffer uses hardware mixing. |
DSBCAPS_LOCSOFTWARE | The buffer is in software memory and uses software mixing. |
DSBCAPS_MUTE3DATMAXDISTANCE | The sound is reduced to silence at the maximum distance. The buffer will stop playing when the maximum distance is exceeded, so that processor time is not wasted. Applies only to software buffers. |
DSBCAPS_PRIMARYBUFFER | The buffer is a primary buffer. |
DSBCAPS_STATIC | The buffer is in on-board hardware memory. |
DSBCAPS_STICKYFOCUS | The buffer has sticky focus. If the user switches to another application not using DirectSound, the buffer is still audible. However, if the user switches to another DirectSound application, the buffer is muted. |
DSBCAPS_TRUEPLAYPOSITION | Force IDirectSoundBuffer8::GetCurrentPosition to return the buffer's true play position. This flag is only valid in Windows Vista. |
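As an illustration of how DSBUFFERDESC and these flags fit together, the sketch below creates an ordinary secondary buffer holding one second of 44.1 kHz, 16-bit stereo PCM (g_pDS is the device object from the earlier sketch; the other names are illustrative):

WAVEFORMATEX wfx;
ZeroMemory(&wfx, sizeof(wfx));
wfx.wFormatTag      = WAVE_FORMAT_PCM;
wfx.nChannels       = 2;                 // stereo
wfx.nSamplesPerSec  = 44100;             // CD-quality sampling rate
wfx.wBitsPerSample  = 16;
wfx.nBlockAlign     = wfx.nChannels * wfx.wBitsPerSample / 8;
wfx.nAvgBytesPerSec = wfx.nSamplesPerSec * wfx.nBlockAlign;

DSBUFFERDESC dsbd;
ZeroMemory(&dsbd, sizeof(dsbd));
dsbd.dwSize        = sizeof(DSBUFFERDESC);
dsbd.dwFlags       = DSBCAPS_CTRLVOLUME | DSBCAPS_GETCURRENTPOSITION2;
dsbd.dwBufferBytes = wfx.nAvgBytesPerSec;    // one second of audio
dsbd.lpwfxFormat   = &wfx;

LPDIRECTSOUNDBUFFER pBuf = NULL;
HRESULT hr = g_pDS->CreateSoundBuffer(&dsbd, &pBuf, NULL);
// on success, Lock/Unlock can now be used to fill pBuf with PCM data before Play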
The SetFormat method sets the format of the primary buffer. Whenever this application has the input focus, DirectSound will set the primary buffer to the specified format.
HRESULT SetFormat(
LPCWAVEFORMATEX pcfxFormat
);
If the method succeeds, the return value is DS_OK. If the method fails, the return value may be one of the following error values:
Return code |
---|
DSERR_BADFORMAT |
DSERR_INVALIDCALL |
DSERR_INVALIDPARAM |
DSERR_OUTOFMEMORY |
DSERR_PRIOLEVELNEEDED |
DSERR_UNSUPPORTED |
The format of the primary buffer should be set before secondary buffers are created.
The method fails if the application has the DSSCL_NORMAL cooperative level.
If the application is using DirectSound at the DSSCL_WRITEPRIMARY cooperative level, and the format is not supported, the method fails.
If the cooperative level is DSSCL_PRIORITY, DirectSound stops the primary buffer, changes the format, and restarts the buffer. The method succeeds even if the hardware does not support the requested format; DirectSound sets the buffer to the closest supported format. To determine whether this has happened, an application can call the GetFormat method for the primary buffer and compare the result with the format that was requested with the SetFormat method.
This method is not available for secondary sound buffers. If a new format is required, the application must create a new DirectSoundBuffer object.
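A sketch of that sequence for the primary buffer, assuming the DSSCL_PRIORITY cooperative level set in the earlier sketch and using the WAVEFORMATEX structure described next (names are illustrative):

// create the primary buffer; no size or format is specified at creation time
DSBUFFERDESC dsbdPrimary;
ZeroMemory(&dsbdPrimary, sizeof(dsbdPrimary));
dsbdPrimary.dwSize  = sizeof(DSBUFFERDESC);
dsbdPrimary.dwFlags = DSBCAPS_PRIMARYBUFFER;

LPDIRECTSOUNDBUFFER pPrimary = NULL;
if (SUCCEEDED(g_pDS->CreateSoundBuffer(&dsbdPrimary, &pPrimary, NULL)))
{
    WAVEFORMATEX wfx;
    ZeroMemory(&wfx, sizeof(wfx));
    wfx.wFormatTag      = WAVE_FORMAT_PCM;
    wfx.nChannels       = 2;
    wfx.nSamplesPerSec  = 22050;
    wfx.wBitsPerSample  = 16;
    wfx.nBlockAlign     = wfx.nChannels * wfx.wBitsPerSample / 8;
    wfx.nAvgBytesPerSec = wfx.nSamplesPerSec * wfx.nBlockAlign;
    // raise the primary buffer to 22.05 kHz, 16-bit, stereo before any
    // secondary buffers are created
    pPrimary->SetFormat(&wfx);
}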
The WAVEFORMATEX structure defines the format of waveform-audio data. Only format information common to all waveform-audio data formats is included in this structure. For formats that require additional information, this structure is included as the first member in another structure, along with the additional information.
This structure is part of the Platform SDK and is not declared in Dsound.h. It is documented here for convenience.
typedef struct WAVEFORMATEX {
WORD wFormatTag;
WORD nChannels;
DWORD nSamplesPerSec;
DWORD nAvgBytesPerSec;
WORD nBlockAlign;
WORD wBitsPerSample;
WORD cbSize;
} WAVEFORMATEX;
Software must process a multiple of nBlockAlign bytes of data at a time. Data written to and read from a device must always start at the beginning of a block. For example, it is illegal to start playback of PCM data in the middle of a sample (that is, on a non-block-aligned boundary).
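The alignment rule is easy to enforce with a small helper (AlignToBlock is a hypothetical name; for PCM, nBlockAlign = nChannels * wBitsPerSample / 8, e.g. 4 bytes per sample frame for 16-bit stereo):

// Round a byte offset down to the nearest sample-frame boundary so that
// playback never starts in the middle of a sample.
DWORD AlignToBlock(DWORD byteOffset, const WAVEFORMATEX &wfx)
{
    return byteOffset - (byteOffset % wfx.nBlockAlign);
}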