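There is a table with 50 rows of English text that needs to be translated into Chinese.
Enter the following prompt in ChatGPT:
You are a Python programming expert who develops applications on large AI models. Write a Python script that completes the following task:
Open the Excel file: "F:\AI自媒體內容\AI行業數據分析\poetop50bots.xlsx"
Read the content of every cell in the range A2 to B51,
Call the deepseek-chat model (context length 32K, maximum output length 4K) to translate each cell's content into Chinese;
The model's base_url is: https://api.deepseek.com
The model's api_key is: XXX
Set the temperature parameter to 1.1
The prompt is: 把英文內容翻譯為中文 (translate the English content into Chinese)
For an example of calling the deepseek-chat model API, refer to the content inside 【】: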
【# Please install OpenAI SDK first: `pip3 install openai`
from openai import OpenAI
client = OpenAI(api_key="<deepseek api key>", base_url="https://api.deepseek.com")
response = client.chat.completions.create(
    model="deepseek-chat",
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello"},
    ],
    stream=False
)
print(response.choices[0].message.content)
Example of the JSON data returned by the model:
{
    "id": "65c327b06948c8d635c8316c6885d95e",
    "choices": [
        {
            "index": 0,
            "message": {
                "content": "Hello! How can I assist you with your programming or computer science questions today?",
                "role": "assistant"
            },
            "finish_reason": "stop",
            "logprobs": null
        }
    ],
    "created": 1717069572,
    "model": "deepseek-coder",
    "system_fingerprint": "fp_ded2115e5a",
    "object": "chat.completion",
    "usage": {
        "prompt_tokens": 18,
        "completion_tokens": 16,
        "total_tokens": 34
    }
}
】
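Parse the returned JSON data, extract the translated text, and write it to a new spreadsheet file;
Notes:
Print relevant information to the screen at every step
If the text of a cell, or the translated text returned for it, exceeds the model's limits, split the text into parts, translate each part, and join the results together;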
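The last note in the prompt asks for over-long text to be split and the translated pieces joined. One way the splitting could be done is sketched below. Several things here are assumptions that are not in the original: the limit is counted in characters rather than tokens, the 4000-character default is an arbitrary illustrative budget, and `split_on_sentences` is a hypothetical helper name. The sketch prefers to cut at sentence ends so that each piece can be translated on its own.

```python
import re


def split_on_sentences(text, max_chars=4000):
    """Split text into chunks of at most max_chars characters.

    Chunks end at sentence boundaries where possible. A single sentence
    longer than max_chars is cut at fixed character offsets as a fallback.
    Joining the returned chunks with "" restores the original text.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")

    # A sentence ends after ., ! or ? followed by whitespace, or at the end
    # of the text. The trailing whitespace stays with its sentence.
    sentences = re.findall(r".+?(?:[.!?]\s+|\Z)", text, flags=re.DOTALL)

    chunks = []
    current = ""
    for sentence in sentences:
        # Hard-cut any sentence that cannot fit into one chunk by itself.
        pieces = [sentence[i:i + max_chars]
                  for i in range(0, len(sentence), max_chars)]
        for piece in pieces:
            # Start a new chunk when adding this piece would exceed the limit.
            if current and len(current) + len(piece) > max_chars:
                chunks.append(current)
                current = ""
            current += piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be translated separately and the results concatenated, for example with `"".join(...)`. Because the budget is in characters, it only approximates the model's token limits and would need tuning against real data.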
Code generated by ChatGPT:
import pandas as pd
import requests

# Read cells A2:B51: no header row, skip row 1, take 50 rows of columns A and B
file_path = "F:\\AI自媒體內容\\AI行業數據分析\\poetop50bots.xlsx"
print(f"Reading file: {file_path}")
df = pd.read_excel(file_path, header=None, usecols="A:B", skiprows=1, nrows=50)
# Use object dtype so translated strings can overwrite numeric cells
df = df.astype(object)

# DeepSeek API configuration
base_url = "https://api.deepseek.com"
api_key = "XXX"
model = "deepseek-chat"

# Maximum characters sent per request. This counts characters, not tokens, and
# is kept far below the 32K context so that the translation is less likely to
# hit the 4K output limit. The value is a rough guess; adjust it as needed.
max_chunk_length = 4000


def translate_text(text):
    """Send one piece of text to deepseek-chat and return its Chinese translation."""
    headers = {
        'Authorization': f'Bearer {api_key}',
        'Content-Type': 'application/json'
    }
    data = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant"},
            # Prompt text: "translate the English content into Chinese"
            {"role": "user", "content": f"把英文內容翻譯為中文:{text}"}
        ],
        "temperature": 1.1,  # value requested in the prompt
        "stream": False
    }
    response = requests.post(f"{base_url}/v1/chat/completions",
                             headers=headers, json=data, timeout=120)
    response.raise_for_status()  # raise an error if the request failed
    # Parse the returned JSON and extract the translated text
    return response.json()['choices'][0]['message']['content']


def split_text(text, max_length):
    """Cut text into consecutive pieces of at most max_length characters."""
    return [text[i:i + max_length] for i in range(0, len(text), max_length)]


for index, row in df.iterrows():
    for col in [0, 1]:  # iterate over columns A and B
        text = row[col]
        if pd.isna(text):
            continue
        text = str(text)
        print(f"Translating cell {chr(col + 65)}{index + 2}: {text}")
        if len(text) > max_chunk_length:
            # Too long for one request: translate piece by piece, then join
            parts = split_text(text, max_chunk_length)
            print(f"Text split into {len(parts)} parts")
            translated_text = ''.join(translate_text(part) for part in parts)
        else:
            translated_text = translate_text(text)
        print(f"Translation: {translated_text}")
        # Write the translated text back into the DataFrame
        df.iat[index, col] = translated_text

# Write the result to a new Excel file, starting at row 2 like the source range
new_file_path = "F:\\AI自媒體內容\\AI行業數據分析\\translated_poetop50bots.xlsx"
df.to_excel(new_file_path, index=False, header=False, startrow=1)
print(f"Translation finished, results written to: {new_file_path}")
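For comparison, the same request can be sent through the OpenAI SDK, as in the example quoted in the prompt, instead of raw `requests`. This is a minimal sketch, assuming the `openai` package is installed. `build_messages` and `translate_with_sdk` are hypothetical names, and the client is passed in as an argument so the function can be tried out without a network call.

```python
def build_messages(text):
    """Build the chat messages for one translation request."""
    return [
        {"role": "system", "content": "You are a helpful assistant"},
        # Prompt text: "translate the English content into Chinese"
        {"role": "user", "content": f"把英文內容翻譯為中文:{text}"},
    ]


def translate_with_sdk(client, text, model="deepseek-chat", temperature=1.1):
    """Translate text through an OpenAI-compatible client object.

    `client` is assumed to be created by the caller, e.g.
    OpenAI(api_key="<deepseek api key>", base_url="https://api.deepseek.com").
    """
    response = client.chat.completions.create(
        model=model,
        messages=build_messages(text),
        temperature=temperature,
        stream=False,
    )
    # The SDK returns an object, so the text is read via attributes
    # rather than by indexing into parsed JSON.
    return response.choices[0].message.content
```

With a real key, the call would look roughly like `translate_with_sdk(OpenAI(api_key="<deepseek api key>", base_url="https://api.deepseek.com"), "Hello")`. Passing the client in also makes it easy to substitute a stub when testing the rest of the script.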