본문 바로가기
TIL/웹 초보

23.04.14

by J1-H00N 2023. 4. 14.

meta tag를 활용하여 크롤링하기 

meta tag들은 약속된 형태가 많기 때문에, 어떤 사이트에서든 공통된 정보를 가져올 수 있다.

import requests
from bs4 import BeautifulSoup

URL = 'https://movie.daum.net/moviedb/main?movieId=161806'
headers = {'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
data = requests.get(URL,headers=headers)
soup = BeautifulSoup(data.text, 'html.parser')

ogtitle = soup.select_one('meta[property="og:title"]')['content'] # meta의 property="og:title"에서 content부분 가져오기
ogimage = soup.select_one('meta[property="og:image"]')['content'] # meta의 property="og:image"에서 content부분 가져오기
ogdesc = soup.select_one('meta[property="og:description"]')['content'] # meta의 property="og:description"에서 content부분 가져오기
print(ogtitle, ogimage, ogdesc)

 

 

지금까지 배운 내용들을 이용해서 작은 프로젝트 만들기

 

기본 세팅

app.py 만들기 -> python -m venv venv -> 새폴더(templates) 만들기 -> index.html 만들기 -> $ pip install flask pymongo dnspython requests bs4

app.py 기본 뼈대

from flask import Flask, render_template, request, jsonify
app = Flask(__name__)

@app.route('/')
def home():
    return render_template('index.html')

@app.route("/movie", methods=["POST"])
def movie_post():
    sample_receive = request.form['sample_give']
    print(sample_receive)
    return jsonify({'msg':'POST 연결 완료!'})

@app.route("/movie", methods=["GET"])
def movie_get():
    return jsonify({'msg':'GET 연결 완료!'})

if __name__ == '__main__':
    app.run('0.0.0.0', port=5000, debug=True)

index.html 기본 뼈대

<!doctype html>
<html lang="en">

<head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/css/bootstrap.min.css" rel="stylesheet"
          integrity="sha384-EVSTQN3/azprG1Anm3QDgpJLIm9Nao0Yz1ztcQTwFspd3yD65VohhpuuCOmLASjC" crossorigin="anonymous">
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/js/bootstrap.bundle.min.js"
            integrity="sha384-MrcW6ZMFYlzcLA8Nl+NtUVF0sA7MsXsP1UyJoMp4YLEuNSfAP+JcXn/tWtIaxVXM"
            crossorigin="anonymous"></script>

    <title>스파르타 피디아</title>

    <link href="https://fonts.googleapis.com/css2?family=Gowun+Dodum&display=swap" rel="stylesheet">

    <style>
        * {
            font-family: 'Gowun Dodum', sans-serif;
        }

        .mytitle {
            width: 100%;
            height: 250px;

            background-image: linear-gradient(0deg, rgba(0, 0, 0, 0.5), rgba(0, 0, 0, 0.5)), url('https://movie-phinf.pstatic.net/20210715_95/1626338192428gTnJl_JPEG/movie_image.jpg');
            background-position: center;
            background-size: cover;

            color: white;

            display: flex;
            flex-direction: column;
            align-items: center;
            justify-content: center;
        }

        .mytitle > button {
            width: 200px;
            height: 50px;

            background-color: transparent;
            color: white;

            border-radius: 50px;
            border: 1px solid white;

            margin-top: 10px;
        }

        .mytitle > button:hover {
            border: 2px solid white;
        }

        .mycomment {
            color: gray;
        }

        .mycards {
            margin: 20px auto 0px auto;
            width: 95%;
            max-width: 1200px;
        }

        .mypost {
            width: 95%;
            max-width: 500px;
            margin: 20px auto 0px auto;
            padding: 20px;
            box-shadow: 0px 0px 3px 0px gray;

            display: none;
        }

        .mybtns {
            display: flex;
            flex-direction: row;
            align-items: center;
            justify-content: center;

            margin-top: 20px;
        }
        .mybtns > button {
            margin-right: 10px;
        }
    </style>
    <script>
        $(document).ready(function(){
          listing();
        });

        function listing() {
            fetch('/movie').then((res) => res.json()).then((data) => {
            console.log(data)
            alert(data['msg'])
            })
        }

        function posting() {
						let formData = new FormData();
            formData.append("sample_give", "샘플데이터");

            fetch('/movie', {method : "POST",body : formData}).then((res) => res.json()).then((data) => {
            console.log(data)
            alert(data['msg'])
            })
        }

        function open_box(){
            $('#post-box').show()
        }
        function close_box(){
            $('#post-box').hide()
        }
    </script>
</head>

<body>
<div class="mytitle">
    <h1>내 생애 최고의 영화들</h1>
    <button onclick="open_box()">영화 기록하기</button>
</div>
<div class="mypost" id="post-box">
    <div class="form-floating mb-3">
        <input id="url" type="email" class="form-control" placeholder="name@example.com">
        <label>영화URL</label>
    </div>
    <div class="form-floating">
        <textarea id="comment" class="form-control" placeholder="Leave a comment here"></textarea>
        <label for="floatingTextarea2">코멘트</label>
    </div>
    <div class="mybtns">
        <button onclick="posting()" type="button" class="btn btn-dark">기록하기</button>
        <button onclick="close_box()" type="button" class="btn btn-outline-dark">닫기</button>
    </div>
</div>
<div class="mycards">
    <div class="row row-cols-1 row-cols-md-4 g-4" id="cards-box">
        <div class="col">
            <div class="card h-100">
                <img src="https://movie-phinf.pstatic.net/20210728_221/1627440327667GyoYj_JPEG/movie_image.jpg"
                     class="card-img-top">
                <div class="card-body">
                    <h5 class="card-title">영화 제목이 들어갑니다</h5>
                    <p class="card-text">여기에 영화에 대한 설명이 들어갑니다.</p>
                    <p>⭐⭐⭐</p>
                    <p class="mycomment">나의 한줄 평을 씁니다</p>
                </div>
            </div>
        </div>
        <div class="col">
            <div class="card h-100">
                <img src="https://movie-phinf.pstatic.net/20210728_221/1627440327667GyoYj_JPEG/movie_image.jpg"
                     class="card-img-top">
                <div class="card-body">
                    <h5 class="card-title">영화 제목이 들어갑니다</h5>
                    <p class="card-text">여기에 영화에 대한 설명이 들어갑니다.</p>
                    <p>⭐⭐⭐</p>
                    <p class="mycomment">나의 한줄 평을 씁니다</p>
                </div>
            </div>
        </div>
        <div class="col">
            <div class="card h-100">
                <img src="https://movie-phinf.pstatic.net/20210728_221/1627440327667GyoYj_JPEG/movie_image.jpg"
                     class="card-img-top">
                <div class="card-body">
                    <h5 class="card-title">영화 제목이 들어갑니다</h5>
                    <p class="card-text">여기에 영화에 대한 설명이 들어갑니다.</p>
                    <p>⭐⭐⭐</p>
                    <p class="mycomment">나의 한줄 평을 씁니다</p>
                </div>
            </div>
        </div>
        <div class="col">
            <div class="card h-100">
                <img src="https://movie-phinf.pstatic.net/20210728_221/1627440327667GyoYj_JPEG/movie_image.jpg"
                     class="card-img-top">
                <div class="card-body">
                    <h5 class="card-title">영화 제목이 들어갑니다</h5>
                    <p class="card-text">여기에 영화에 대한 설명이 들어갑니다.</p>
                    <p>⭐⭐⭐</p>
                    <p class="mycomment">나의 한줄 평을 씁니다</p>
                </div>
            </div>
        </div>
    </div>
</div>
</body>

</html>

 

클라이언트에서 url과 코멘트 작성시 원하는 데이터들을 db에 저장하기

 

백엔드 만들기

@app.route("/movie", methods=["POST"])
def movie_post():
    url_receive = request.form['url_give']
    comment_receive = request.form['comment_give']

    headers = {'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
    data = requests.get(url_receive,headers=headers)

    soup = BeautifulSoup(data.text, 'html.parser')

    ogtitle = soup.select_one('meta[property="og:title"]')['content']
    ogimage = soup.select_one('meta[property="og:image"]')['content']
    ogdesc = soup.select_one('meta[property="og:description"]')['content']

    doc = {
        'title' : ogtitle, 
        'desc' : ogdesc, 
        'image' : ogimage,
        'comment' : comment_receive
    }
    db.movies.insert_one(doc)

    return jsonify({'msg':'저장 완료!'})

클라이언트 만들기

function posting() {
            let url = $('#url').val()
            let comment = $('#comment').val()

            let formData = new FormData();
            formData.append("url_give", url);
            formData.append("comment_give", comment);

            fetch('/movie', { method: "POST", body: formData }).then((res) => res.json()).then((data) => {
                alert(data['msg'])
                window.location.reload()
            })
        }

 

DB에서 가져온 데이터들을 카드 형태로 클라이언트에 만들기

마찬가지로 백엔드부터 만들기

@app.route("/movie", methods=["GET"])
def movie_get():
    all_movies = list(db.movise.find({},{'_id':False}))
    return jsonify({'result': all_movies})

그 다음 클라이언트

function listing() {
            fetch('/movie').then((res) => res.json()).then((data) => {
                let rows = data['result']
                $('#cards-box').empty()
                rows.forEach((a) => {
                    let comment = a['comment']
                    let title = a['title']
                    let desc = a['desc']
                    let image = a['image']

                    let temp_html = `<div class="col">
                                        <div class="card h-100">
                                            <img src="${image}"
                                                class="card-img-top">
                                            <div class="card-body">
                                                <h5 class="card-title">${title}</h5>
                                                <p class="card-text">${desc}</p>
                                                <p>⭐⭐⭐</p>
                                                <p class="mycomment">${comment}</p>
                                            </div>
                                        </div>
                                    </div>`
                    $('#cards-box').append(temp_html)
                })
            })
        }

 

별점 기능 추가후 완성 코드

app.py

from flask import Flask, render_template, request, jsonify
app = Flask(__name__)

from pymongo import MongoClient
client = MongoClient('mongodb+srv://sparta:test@cluster0.szcxpef.mongodb.net/?retryWrites=true&w=majority')
db = client.dbsparta

import requests
from bs4 import BeautifulSoup

@app.route('/')
def home():
    return render_template('index.html')

@app.route("/movie", methods=["POST"])
def movie_post():
    url_receive = request.form['url_give']
    comment_receive = request.form['comment_give']
    star_receive = request.form['star_give']

    headers = {'User-Agent' : 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'}
    data = requests.get(url_receive,headers=headers)

    soup = BeautifulSoup(data.text, 'html.parser')

    ogtitle = soup.select_one('meta[property="og:title"]')['content']
    ogimage = soup.select_one('meta[property="og:image"]')['content']
    ogdesc = soup.select_one('meta[property="og:description"]')['content']

    doc = {
        'title' : ogtitle, 
        'desc' : ogdesc, 
        'image' : ogimage,
        'comment' : comment_receive,
        'star' : star_receive
    }
    db.movies.insert_one(doc)

    return jsonify({'msg':'저장 완료!'})

@app.route("/movie", methods=["GET"])
def movie_get():
    all_movies = list(db.movies.find({},{'_id':False}))
    return jsonify({'result': all_movies})

if __name__ == '__main__':
    app.run('0.0.0.0', port=5000, debug=True)

index.html

<!doctype html>
<html lang="en">

<head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/css/bootstrap.min.css" rel="stylesheet"
        integrity="sha384-EVSTQN3/azprG1Anm3QDgpJLIm9Nao0Yz1ztcQTwFspd3yD65VohhpuuCOmLASjC" crossorigin="anonymous">
    <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js"></script>
    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/js/bootstrap.bundle.min.js"
        integrity="sha384-MrcW6ZMFYlzcLA8Nl+NtUVF0sA7MsXsP1UyJoMp4YLEuNSfAP+JcXn/tWtIaxVXM"
        crossorigin="anonymous"></script>

    <title>스파르타 피디아</title>

    <link href="https://fonts.googleapis.com/css2?family=Gowun+Dodum&display=swap" rel="stylesheet">

    <style>
        * {
            font-family: 'Gowun Dodum', sans-serif;
        }

        .mytitle {
            width: 100%;
            height: 250px;

            background-image: linear-gradient(0deg, rgba(0, 0, 0, 0.5), rgba(0, 0, 0, 0.5)), url('https://movie-phinf.pstatic.net/20210715_95/1626338192428gTnJl_JPEG/movie_image.jpg');
            background-position: center;
            background-size: cover;

            color: white;

            display: flex;
            flex-direction: column;
            align-items: center;
            justify-content: center;
        }

        .mytitle>button {
            width: 200px;
            height: 50px;

            background-color: transparent;
            color: white;

            border-radius: 50px;
            border: 1px solid white;

            margin-top: 10px;
        }

        .mytitle>button:hover {
            border: 2px solid white;
        }

        .mycomment {
            color: gray;
        }

        .mycards {
            margin: 20px auto 0px auto;
            width: 95%;
            max-width: 1200px;
        }

        .mypost {
            width: 95%;
            max-width: 500px;
            margin: 20px auto 0px auto;
            padding: 20px;
            box-shadow: 0px 0px 3px 0px gray;

            display: none;
        }

        .mybtns {
            display: flex;
            flex-direction: row;
            align-items: center;
            justify-content: center;

            margin-top: 20px;
        }

        .mybtns>button {
            margin-right: 10px;
        }
    </style>
    <script>
        $(document).ready(function () {
            listing();
        });

        function listing() {
            fetch('/movie').then((res) => res.json()).then((data) => {
                let rows = data['result']
                $('#cards-box').empty()
                rows.forEach((a) => {
                    let comment = a['comment']
                    let title = a['title']
                    let desc = a['desc']
                    let image = a['image']
                    let star = a['star']

                    let star_repeat = '⭐'.repeat(star)

                    let temp_html = `<div class="col">
                                        <div class="card h-100">
                                            <img src="${image}"
                                                class="card-img-top">
                                            <div class="card-body">
                                                <h5 class="card-title">${title}</h5>
                                                <p class="card-text">${desc}</p>
                                                <p>${star_repeat}</p>
                                                <p class="mycomment">${comment}</p>
                                            </div>
                                        </div>
                                    </div>`
                    $('#cards-box').append(temp_html)
                })
            })
        }

        function posting() {
            let url = $('#url').val()
            let comment = $('#comment').val()
            let star = $('#star').val()

            let formData = new FormData();
            formData.append("url_give", url);
            formData.append("comment_give", comment);
            formData.append("star_give", star);

            fetch('/movie', { method: "POST", body: formData }).then((res) => res.json()).then((data) => {
                alert(data['msg'])
                window.location.reload()
            })
        }

        function open_box() {
            $('#post-box').show()
        }
        function close_box() {
            $('#post-box').hide()
        }
    </script>
</head>

<body>
    <div class="mytitle">
        <h1>내 생애 최고의 영화들</h1>
        <button onclick="open_box()">영화 기록하기</button>
    </div>
    <div class="mypost" id="post-box">
        <div class="form-floating mb-3">
            <input id="url" type="email" class="form-control" placeholder="name@example.com">
            <label>영화URL</label>
        </div>
        <div class="input-group mb-3">
            <label class="input-group-text" for="inputGroupSelect01">별점</label>
            <select class="form-select" id="star">
                <option selected>-- 선택하기 --</option>
                <option value="1">⭐</option>
                <option value="2">⭐⭐</option>
                <option value="3">⭐⭐⭐</option>
                <option value="4">⭐⭐⭐⭐</option>
                <option value="5">⭐⭐⭐⭐⭐</option>
            </select>
        </div>
        <div class="form-floating">
            <textarea id="comment" class="form-control" placeholder="Leave a comment here"></textarea>
            <label for="floatingTextarea2">코멘트</label>
        </div>
        <div class="mybtns">
            <button onclick="posting()" type="button" class="btn btn-dark">기록하기</button>
            <button onclick="close_box()" type="button" class="btn btn-outline-dark">닫기</button>
        </div>
    </div>
    <div class="mycards">
        <div class="row row-cols-1 row-cols-md-4 g-4" id="cards-box">
            <div class="col">
                <div class="card h-100">
                    <img src="https://movie-phinf.pstatic.net/20210728_221/1627440327667GyoYj_JPEG/movie_image.jpg"
                        class="card-img-top">
                    <div class="card-body">
                        <h5 class="card-title">영화 제목이 들어갑니다</h5>
                        <p class="card-text">여기에 영화에 대한 설명이 들어갑니다.</p>
                        <p>⭐⭐⭐</p>
                        <p class="mycomment">나의 한줄 평을 씁니다</p>
                    </div>
                </div>
            </div>
            <div class="col">
                <div class="card h-100">
                    <img src="https://movie-phinf.pstatic.net/20210728_221/1627440327667GyoYj_JPEG/movie_image.jpg"
                        class="card-img-top">
                    <div class="card-body">
                        <h5 class="card-title">영화 제목이 들어갑니다</h5>
                        <p class="card-text">여기에 영화에 대한 설명이 들어갑니다.</p>
                        <p>⭐⭐⭐</p>
                        <p class="mycomment">나의 한줄 평을 씁니다</p>
                    </div>
                </div>
            </div>
            <div class="col">
                <div class="card h-100">
                    <img src="https://movie-phinf.pstatic.net/20210728_221/1627440327667GyoYj_JPEG/movie_image.jpg"
                        class="card-img-top">
                    <div class="card-body">
                        <h5 class="card-title">영화 제목이 들어갑니다</h5>
                        <p class="card-text">여기에 영화에 대한 설명이 들어갑니다.</p>
                        <p>⭐⭐⭐</p>
                        <p class="mycomment">나의 한줄 평을 씁니다</p>
                    </div>
                </div>
            </div>
            <div class="col">
                <div class="card h-100">
                    <img src="https://movie-phinf.pstatic.net/20210728_221/1627440327667GyoYj_JPEG/movie_image.jpg"
                        class="card-img-top">
                    <div class="card-body">
                        <h5 class="card-title">영화 제목이 들어갑니다</h5>
                        <p class="card-text">여기에 영화에 대한 설명이 들어갑니다.</p>
                        <p>⭐⭐⭐</p>
                        <p class="mycomment">나의 한줄 평을 씁니다</p>
                    </div>
                </div>
            </div>
        </div>
    </div>
</body>

</html>

'TIL > 웹 초보' 카테고리의 다른 글

23.04.17  (0) 2023.04.17
23.04.14  (0) 2023.04.14
챗GPT로 웹사이트 만들기  (0) 2023.04.13
23.04.13  (0) 2023.04.13
23.04.12  (0) 2023.04.12