[Convenience Store Crawling] MINISTOP

์จ๋ฐ 2023. 3. 24. 12:47

🚩 PLAN

1. Fetch store data from the MINISTOP convenience store information web page using Python.

2. Build a DataFrame from the fetched data and save it locally as a pickle file.

 

 

 

 

[ Target site ]

 

https://www.ministop.co.kr/

 

 

 

 


 

 

First, I looked at the structure of the target site.

 

 

MINISTOP ์˜ ๋ฉ”์ธ ์›น ํŽ˜์ด์ง€์—์„œ ๋งค์žฅ ์ฐพ๊ธฐ ๋ฒ„ํŠผ์„ ๋ˆ„๋ฅด๋ฉด, ์šฐ๋ฆฌ๊ฐ€ ๋ฐ์ดํ„ฐ๋ฅผ ์–ป์„ ์ˆ˜ ์žˆ๋Š” ๋งค์žฅ ์ฐพ๊ธฐ ํŽ˜์ด์ง€๋กœ ์ด๋™ํ•œ๋‹ค.

 

์ž„์˜๋กœ ์•„๋ฌด ์ง€์—ญ์œผ๋กœ ์˜ต์…˜์„ ์„ค์ •ํ•ด์ฃผ๊ณ , ์ฐพ๊ธฐ ๋ฒ„ํŠผ์„ ๋ˆ„๋ฅด๋ฉด ์˜ค๋ฅธ์ชฝ ์ด๋ฏธ์ง€์™€ ๊ฐ™์€ api ๋ฅผ ๋ณผ ์ˆ˜ ์žˆ๋‹ค.

payload ๋ฅผ ์‚ดํŽด๋ณด๋ฉฐ ๋ณ€๊ฒฝํ•ด์ฃผ์–ด์•ผ ํ•  key ๋ฅผ ํ™•์ธํ•ด๋ณธ๋‹ค.

 

To learn more about the paramInfo key in the payload, I selected a different si/do (province or metropolitan city) and si/gun/gu (city, county, or district) and searched again.

 

 

Comparing the two requests confirmed that the paramInfo key carries the si/do and si/gun/gu values.
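
The comparison can also be done in code rather than by eye. The sketch below diffs two captured payloads. The first payload uses the values from the code later in this post; the second one (`search_b`) and the helper name `changed_keys` are made up for illustration, so treat its values as assumptions rather than a real capture.

```python
def changed_keys(first: dict, second: dict) -> dict:
    """Return {key: (first_value, second_value)} for every key whose value differs."""
    keys = sorted(set(first) | set(second))
    return {
        k: (first.get(k), second.get(k))
        for k in keys
        if first.get(k) != second.get(k)
    }


# Payload from the first search (same values as in the code later in this post).
search_a = {
    "pageId": "store/store",
    "sqlnum": "2",
    "paramInfo": "1:",
    "pageNum": "1",
    "sortGu": "",
    "tm": "1677986477478",
}

# Hypothetical payload from a second search with a different region selected.
search_b = dict(search_a, paramInfo="2:", tm="1677986500000")

diff = changed_keys(search_a, search_b)
for key, (before, after) in diff.items():
    print(f"{key}: {before!r} -> {after!r}")
```

With these sample values, only `paramInfo` and `tm` differ, which matches the observation that `paramInfo` is the key to vary per region.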

 

The Preview tab also shows in advance what shape the response data will take.

 

 

๊น”๋”ํ•˜๊ฒŒ ํ•„์š”ํ•œ ๋ฐ์ดํ„ฐ๋“ค์„ ๋”•์…”๋„ˆ๋ฆฌ ํ˜•ํƒœ๋กœ ์ž˜ ๋งคํ•‘๋˜์–ด ์žˆ๋Š” ๊ฒƒ์„ ๋ณผ ์ˆ˜ ์žˆ๋‹ค.

๋ฐ์ดํ„ฐ๋ฅผ ์š”์ฒญํ•ด์„œ ๋ฐ›๊ฒŒ ๋˜๋ฉด, ์œ„์™€ ๊ฐ™์ด dictionary ํ˜•ํƒœ, ์ฆ‰ json ์„ ์‚ฌ์šฉํ•ด์„œ ์ •๋ณด๋ฅผ ๋ฐ›์„ ์ˆ˜ ์žˆ๊ฒ ๋‹ค๊ณ  ์ƒ๊ฐํ–ˆ๋‹ค.

 

 

Following this flow, I wrote the code below.

 

 

โŒจ๏ธ Code

 

import requests
import pandas as pd


mini_url = "https://www.ministop.co.kr/MiniStopHomePage/page/querySimple.do"

# Note: the 'tm' value is not required, so it can be left unset.
mini_pay = {
    "pageId": "store/store",
    "sqlnum": "2",
    "paramInfo": "1:",
    "pageNum": "1",
    "sortGu": "",
    "tm": "1677986477478",
}

mini_name = []
mini_address = []
mini_phone = []
mini_ice = []
mini_lat = []
mini_lon = []

store_df = pd.DataFrame()

# The si/do codes run from 1 to 16.
for x in range(1, 17):
    # sqlnum 2: request the gu list for si/do code x.
    mini_pay["sqlnum"] = 2
    mini_pay["paramInfo"] = "{}:".format(x)
    r = requests.post(mini_url, data=mini_pay)  # the response body is JSON

    num_gu = []  # each entry holds the gu number and the gu name

    for y in r.json()["recordList"]:
        num_gu.append(y["fields"])

    # sqlnum 3: fetch the store records for each gu.
    mini_pay["sqlnum"] = 3

    for z in num_gu:
        # paramInfo takes the form "sido:gu:gu:", with the gu number given twice.
        mini_pay["paramInfo"] = "{}:{}:{}:".format(x, z[0], z[0])

        r = requests.post(mini_url, data=mini_pay)

        for t in r.json()["recordList"]:
            mini_name.append(t["fields"][0])
            mini_address.append(t["fields"][1])
            mini_phone.append(t["fields"][2])
            mini_ice.append(t["fields"][3])
            mini_lat.append(t["fields"][4])
            mini_lon.append(t["fields"][5])


store_df["매장명"] = mini_name      # store name
store_df["주소"] = mini_address     # address
store_df["연락처"] = mini_phone     # contact number
store_df["소프트크림"] = mini_ice   # soft cream
store_df["위도"] = mini_lat         # latitude
store_df["경도"] = mini_lon         # longitude

store_df.to_pickle("./ministop_store_soft_latlon.pkl")
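
For reference, the same two-step flow (list the gu entries per si/do, then fetch stores per gu) can be sketched as a function that takes the HTTP call as a parameter, which makes it possible to try the logic without hitting the site. The names `make_payload`, `crawl_stores`, and `fake_post`, and all the sample records, are hypothetical and only for illustration. The `tm` key is left out because, as noted above, it is not required.

```python
from typing import Callable, Dict, List

Payload = Dict[str, str]
PostFn = Callable[[Payload], dict]  # takes a form payload, returns the parsed JSON


def make_payload(sqlnum: int, param_info: str) -> Payload:
    """Build the form payload; only sqlnum and paramInfo vary between requests."""
    return {
        "pageId": "store/store",
        "sqlnum": str(sqlnum),
        "paramInfo": param_info,
        "pageNum": "1",
        "sortGu": "",
    }


def crawl_stores(post: PostFn, sido_codes=range(1, 17)) -> List[list]:
    """Collect the first six fields of every store record across all regions."""
    stores = []
    for sido in sido_codes:
        # Step 1 (sqlnum 2): list the gu entries for this si/do code.
        gu_list = post(make_payload(2, f"{sido}:"))["recordList"]
        for gu in gu_list:
            gu_code = gu["fields"][0]
            # Step 2 (sqlnum 3): fetch the stores for this gu.
            param = f"{sido}:{gu_code}:{gu_code}:"
            for record in post(make_payload(3, param))["recordList"]:
                stores.append(record["fields"][:6])
    return stores


def fake_post(payload: Payload) -> dict:
    """Stand-in for the real endpoint that returns made-up records."""
    if payload["sqlnum"] == "2":
        return {"recordList": [{"fields": ["10", "Sample-gu"]}]}
    name = "Sample Store " + payload["paramInfo"]
    return {"recordList": [{"fields": [name, "addr", "tel", "Y", "37.5", "127.0"]}]}


stores = crawl_stores(fake_post, sido_codes=[1, 2])
print(len(stores), stores[0][0])
```

To run it against the real site, the stand-in could be swapped for something like `lambda p: requests.post(mini_url, data=p).json()`, assuming the endpoint still behaves as described in this post.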

 

 

📋 DataFrame

 

With the code above, the final DataFrame is saved in the form shown below.