Recommendation Algorithm Case 1:

오석 양
August 7, 2023
4 min read

Last updated: August 9, 2023

An Application of the Hierarchical Ranking Theory between Preference and Distance for Tourism

Python Coding & Its Explanation

This section walks through the Python code that implements a hierarchical recommendation algorithm based on preference and distance.

import pandas as pd

import numpy as np

from selenium import webdriver

from selenium.webdriver.common.keys import Keys

from selenium.webdriver.support.ui import WebDriverWait

from selenium.webdriver.support import expected_conditions as EC

from selenium.common.exceptions import TimeoutException

from selenium.common.exceptions import NoSuchElementException

from selenium.webdriver.common.by import By

from selenium.webdriver.support.ui import Select

import time

from tqdm import tqdm

☞ Explanation

This block imports pandas and numpy for data handling; selenium's webdriver, Keys, WebDriverWait, expected_conditions, By, and Select, together with the TimeoutException and NoSuchElementException classes; the time module for time-related operations; and tqdm for progress bars.

df = pd.read_excel('./All_강원관광지_춘천시.xlsx')

print(df)

☞ Explanation

The given code uses the pandas library to read an Excel file and create a DataFrame. It then prints the DataFrame to display the data.

from sklearn.feature_extraction.text import CountVectorizer

from sklearn.metrics.pairwise import cosine_similarity

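# min_df=1 keeps every term, the same effect as 0; some scikit-learn releases reject an integer min_df of 0
count_vect_category = CountVectorizer(min_df=1, ngram_range=(1,2))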

place_category = count_vect_category.fit_transform(df['naver_store_type'])

place_simi_cate = cosine_similarity(place_category, place_category)

place_simi_cate_sorted_ind = place_simi_cate.argsort()[:, ::-1]

☞ Explanation

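This block uses scikit-learn for text feature extraction and similarity. CountVectorizer turns each place's naver_store_type string into a vector of unigram and bigram counts (ngram_range=(1,2)), cosine_similarity computes the similarity between every pair of places, and argsort()[:, ::-1] stores, for each place, the indices of all places ordered from most to least similar.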
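To make the vectorize-then-compare idea concrete, here is a minimal standard-library sketch of the same computation. It is a hand-rolled illustration, not scikit-learn's implementation: the whitespace tokenizer only approximates CountVectorizer, and the category strings and the names ngram_counts and cosine are hypothetical.

```python
import math
from collections import Counter

def ngram_counts(text):
    """Count word unigrams and bigrams, roughly what ngram_range=(1, 2) produces."""
    words = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(words, words[1:])]
    return Counter(words + bigrams)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical category strings, not taken from the Excel file
categories = ["temple heritage site", "temple", "camping site"]
vectors = [ngram_counts(c) for c in categories]

# Pairwise similarity matrix: similarity[i][j] compares place i with place j
similarity = [[cosine(u, v) for v in vectors] for u in vectors]

# For each row, column indices ordered from most to least similar
sorted_ind = [sorted(range(len(row)), key=lambda j: row[j], reverse=True)
              for row in similarity]
```

Each place is most similar to itself, so its own index comes first in its row of sorted_ind.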

count_vect_review = CountVectorizer(min_df=2, ngram_range=(1,2))

place_review = count_vect_review.fit_transform(df['naver_blog_review_txt'].values.astype('U'))

place_simi_review = cosine_similarity(place_review, place_review)

place_simi_review_sorted_ind = place_simi_review.argsort()[:, ::-1]

☞ Explanation

The same steps are applied to the blog review text. Here min_df=2 drops terms that appear in fewer than two reviews, and astype('U') converts every value to a Unicode string before vectorizing.

place_simi_co = (
    place_simi_cate * 0.3
    + place_simi_review * 10
    + np.repeat([df['naver_blog_review_qty'].values], len(df['naver_blog_review_qty']), axis=0) * 0.001
    + np.repeat([df['naver_star_point'].values], len(df['naver_star_point']), axis=0) * 0.005
    + np.repeat([df['naver_star_point_qty'].values], len(df['naver_star_point_qty']), axis=0) * 0.001
)

place_simi_co_sorted_ind = place_simi_co.argsort()[:, ::-1]

def find_simi_place(df, sorted_ind, place_name, top_n=10):
    place_title = df[df['name'] == place_name]
    place_index = place_title.index.values
    similar_indexes = sorted_ind[place_index, :(top_n)]
    similar_indexes = similar_indexes.reshape(-1)
    return df.iloc[similar_indexes]

☞ Explanation

The code combines five terms (formulas 1 to 5) into a single score using weighted coefficients. The resulting matrix place_simi_co adds category similarity (place_simi_cate, weight 0.3), review text similarity (place_simi_review, weight 10), the number of blog reviews (df['naver_blog_review_qty'], weight 0.001), the star rating (df['naver_star_point'], weight 0.005), and the number of star ratings (df['naver_star_point_qty'], weight 0.001).

place_simi_co_sorted_ind holds, for each place, the indices of all places sorted by combined score in descending order. The find_simi_place function takes the DataFrame df, the sorted indices sorted_ind, a place_name, and an optional top_n parameter (default 10), and returns the rows of the top_n places with the highest combined score for that place.
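The sketch below shows on a tiny example how the np.repeat terms behave. All numbers are made up for illustration, and only one of the three popularity terms is included. The point is that np.repeat([v], n, axis=0) stacks the same row n times, so each candidate place j receives the same bonus in every row, acting as a popularity boost rather than a pairwise similarity.

```python
import numpy as np

# Hypothetical 3-place example; the numbers are made up for illustration
simi_cate = np.array([[1.0, 0.5, 0.0],
                      [0.5, 1.0, 0.0],
                      [0.0, 0.0, 1.0]])
simi_review = np.array([[1.0, 0.2, 0.1],
                        [0.2, 1.0, 0.3],
                        [0.1, 0.3, 1.0]])
review_qty = np.array([100, 2000, 50])  # per-place blog review counts

# np.repeat([v], n, axis=0) stacks the same row n times ...
repeated = np.repeat([review_qty], len(review_qty), axis=0)
# ... which matches what NumPy broadcasting does implicitly for a 1-D array
assert np.array_equal(repeated, np.broadcast_to(review_qty, (3, 3)))

# Weighted sum with the same coefficients as the article's first three terms
combined = simi_cate * 0.3 + simi_review * 10 + review_qty * 0.001

# Row i lists candidate indices from highest to lowest combined score
sorted_ind = combined.argsort()[:, ::-1]

# Top 2 candidates for place 2 (the place itself comes first)
top_for_place_2 = sorted_ind[2, :2]
```

Because the review text weight (10) is much larger than the others, review similarity dominates the ranking in this toy example.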

from haversine import haversine

# Convert the coordinate columns from strings to floats in one step
df['latitude'] = df['latitude'].astype(float)
df['longitude'] = df['longitude'].astype(float)

☞ Explanation

The haversine library is used to calculate the distance between two points on the Earth's surface. It is commonly used to compute the distance between two geographical coordinates specified by their latitude and longitude.

The haversine function takes two (latitude, longitude) tuples and returns the distance between them, in kilometers by default.

The code above converts the latitude and longitude values in the DataFrame df from strings to floats so that they can later be passed to haversine to compute distances.
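For readers who want to see what the distance calculation involves, here is a minimal sketch of the haversine formula using only the standard library. It is not the haversine package's own code; the function name haversine_km and the Earth radius constant are assumptions of this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch

def haversine_km(point1, point2):
    """Great-circle distance in km between two (latitude, longitude) tuples in degrees."""
    lat1, lon1 = map(math.radians, point1)
    lat2, lon2 = map(math.radians, point2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# One degree of longitude along the equator, as a quick sanity check
one_degree_on_equator = haversine_km((0.0, 0.0), (0.0, 1.0))
```

The argument order matters: each point is (latitude, longitude), the same order used for the 'location' tuples in this article.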

# Store each place's coordinates as a (latitude, longitude) tuple
df['location'] = list(zip(df['latitude'], df['longitude']))

☞ Explanation

The provided code creates a new column 'location' in the DataFrame df and populates it with tuples containing latitude and longitude values for each row. This approach allows you to manage the latitude and longitude of each location as a single tuple.

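from IPython.display import display, Image

# Default: no distance limit unless the user asks for one
set_dist = float('inf')

print("Enter 1 if the recommendations should be limited by distance. (Press Enter if distance does not matter.)")
dist = input()

if dist == "1":
    print("Which means of transport are you using?\n1. Walking\n2. Bicycle\n3. Car")
    display(Image("C:/Users/media/gangwontour/transport.jpg"))
    trans = input()
    if trans == "1":
        set_dist = 3   # km
    elif trans == "2":
        set_dist = 7   # km
    elif trans == "3":
        set_dist = 30  # km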

☞ Explanation

This part of the script is interactive: it uses input() to ask whether the recommendations should be limited by distance and, if so, which means of transport the user is using.

Based on the chosen means of transport, the code sets the distance limit set_dist:

If the user chooses "1" for walking, set_dist is set to 3 kilometers (30 minutes of walking).

If the user chooses "2" for cycling, set_dist is set to 7 kilometers (30 minutes of cycling).

If the user chooses "3" for driving, set_dist is set to 30 kilometers (30 minutes of driving).

If the user just presses Enter at the first prompt, the transport question is skipped.
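The mapping from transport choice to distance limit can also be written as a lookup table. The sketch below is one possible arrangement, not the article's code: the names DISTANCE_LIMIT_KM and distance_limit are hypothetical, and treating an unrecognised choice as "no limit" is an assumption.

```python
import math

# Distance limits in km, taken from the explanation above
DISTANCE_LIMIT_KM = {"1": 3, "2": 7, "3": 30}

def distance_limit(wants_limit, transport_choice):
    """Return the distance limit in km, or infinity when no limit applies."""
    if wants_limit != "1":
        return math.inf
    # Hypothetical fallback: an unrecognised choice means no limit
    return DISTANCE_LIMIT_KM.get(transport_choice, math.inf)

# Example: the user asked for a limit and chose cycling
cycling_limit = distance_limit("1", "2")
```

Returning infinity keeps a later comparison such as distance < set_dist valid even when the user does not want a limit.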

for i in range(len(df)):
    print(df['name'].iloc[i])

☞ Explanation

The provided code iterates through the DataFrame df and prints the names of the places contained in the 'name' column. It displays the names of all the places present in the DataFrame.

print("Enter a Chuncheon tourist attraction that you visit often.")
cafe_name = input()
simi_df = find_simi_place(df, place_simi_co_sorted_ind, cafe_name, 10)

☞ Explanation

By running this code, you can find other tourist spots in Chuncheon that are similar to the frequently visited tourist spot provided by the user. The results will be stored in the DataFrame simi_df, and you can explore the similar tourist spots by examining the content of simi_df.

for i in range(len(df)):
    if df['name'].iloc[i] == cafe_name:
        rec = i

☞ Explanation

The provided code is a loop that searches for a specific value in the 'name' column of the DataFrame df. It assigns the index of the row where the value is found to the variable rec.

# Work on a copy so that assigning a new column does not write into a slice of df
simi_df = simi_df.copy()
simi_df['distance'] = [haversine(df['location'].iloc[rec], loc) for loc in simi_df['location']]

☞ Explanation

The provided code calculates the distances between the location of the frequently visited tourist spot (cafe_name) and the locations of other similar tourist spots found in the simi_df DataFrame. It uses the haversine function from the haversine library to compute the distances based on latitude and longitude coordinates.

# Keep places within the distance limit, then drop the query place itself (distance 0)
rec_df = simi_df[simi_df['distance'] < set_dist]
final_rec_df = rec_df[rec_df['distance'] != 0]

☞ Explanation

The code filters the simi_df DataFrame in two steps. rec_df keeps only the places whose 'distance' is less than set_dist kilometers, and final_rec_df then removes the row whose distance is 0, which is the place the user entered.
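The two filtering conditions can be illustrated without pandas. The sketch below uses a hypothetical list of candidates with made-up distances to show which rows survive each step.

```python
# Hypothetical candidates with precomputed distances in km
candidates = [
    {"name": "query place", "distance": 0.0},
    {"name": "nearby place", "distance": 2.4},
    {"name": "far place", "distance": 18.0},
]
set_dist = 7  # e.g. the cycling limit

# Step 1: keep candidates closer than the limit
within_limit = [p for p in candidates if p["distance"] < set_dist]

# Step 2: drop the query place itself, which is at distance 0
final = [p for p in within_limit if p["distance"] != 0]
final_names = [p["name"] for p in final]
```

Only the nearby place remains: the far place fails the distance limit and the query place is excluded because its distance to itself is 0.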

from IPython.display import display, Image

for i in range(len(final_rec_df)):
    print(final_rec_df.iloc[i]['name'])
    display(Image("./chuncheontour/" + final_rec_df.iloc[i]['name'] + ".jpg"))

☞ Explanation

The provided code prints the names of the recommended tourist spots in final_rec_df DataFrame and displays the images of each spot. The images are expected to be stored in the "./chuncheontour/" folder with filenames corresponding to the names of the tourist spots. However, if the images are not available in the folder, it may raise an error.
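One way to guard against missing image files is to check for each file before displaying it. The helper below is a sketch under that assumption; the function name existing_image_paths is hypothetical, and the folder default simply mirrors the path used in the code above.

```python
from pathlib import Path

def existing_image_paths(names, folder="./chuncheontour/"):
    """Pair each place name with its .jpg path, or None if the file is missing."""
    result = []
    for name in names:
        path = Path(folder) / f"{name}.jpg"
        result.append((name, path if path.is_file() else None))
    return result

# Example with a folder that does not exist: every path comes back as None
checked = existing_image_paths(["example place"], folder="./missing_folder_for_demo/")
```

In the display loop, the image would then be shown only when the returned path is not None, and the name alone would be printed otherwise.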

Results of Processing Recommendation Algorithm

Running the code above produces the following result:

봄내길 2코스(물깨말구구리길)

추곡약수터

청평사

천전리 지석묘군

의암호

지암계곡

집다리캠핑장

춘천 자전거길

죽림동성당

import os
import openai

# Read the key from the environment instead of hard-coding it in the script
openai.api_key = os.environ["OPENAI_API_KEY"]

def call_chatgpt(prompt):
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=50,
        stop=None
    )
    return response.choices[0].text.strip()

user_input = input("After the Python code has run, press Enter to continue to ChatGPT.")

conversation = ""
while True:
    user_message = input("User: ")
    if user_message.lower() == "exit":
        break
    # Append each turn so the model sees the whole conversation so far
    conversation += f"User: {user_message}\n"
    response = call_chatgpt(conversation)
    conversation += f"ChatGPT: {response}\n"
    print("ChatGPT:", response)

☞ Explanation

In this step, after the deep learning recommendation algorithm has produced its Chuncheon recommendations, we ask ChatGPT to recommend tourist attractions in Chuncheon and compare the answers. If you connect with your own ChatGPT API key and enter a question such as 'Can you recommend a tourist spot in Chuncheon?', ChatGPT will make a recommendation. Note that the ChatGPT API requires a paid account.

As mentioned earlier, ChatGPT's recommendations are based on general information rather than personal preference, so tourists have to compare the destinations recommended by the deep learning recommendation algorithm with those recommended by ChatGPT and choose the final destination themselves.

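Of course! Chuncheon is a city famous for its beautiful natural scenery and varied tourist attractions. Here are a few recommended attractions in Chuncheon:

Nami Island: Nami Island is a small island in the middle of Chuncheon Lake where art and nature come together. You can enjoy its distinctive atmosphere, full of trees and flowers, along with its artworks.

Gyeongpodae: With beautiful views overlooking the East Sea, Gyeongpodae is also a good place to watch the sunrise or sunset.

Gangchon Rail Park: A theme park near Gangchon Station built around railway facilities and rail vehicles. It offers fun activities to enjoy with family or friends.

Chuncheon Myeongdong Movie Street: A location where films such as 'The King and the Clown' and 'Sunny' were shot. It offers fun experiences in which visitors can re-enact famous movie scenes.

Soyanggang Skywalk: A glass bridge over the Soyang River where you can take in the tranquil natural scenery along with pesticide-free vegetable fields.

Gapyeong Rail Bike: Located in Gapyeong, close to Chuncheon, the rail bike is a leisure activity that uses an old railway line, letting you take a stroll while enjoying the beautiful scenery.

Nami Island Water World: In summer, Water World operates on Nami Island as a water park. You can enjoy water play, mini water slides, and more.

Chuncheon Myeongdong Nanta Theater: A venue for watching traditional Korean Nanta performances, with an impressive stage on which splendid costumes, cargo, and people perform while singing together.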

Next, we write code that asks ChatGPT for detailed information about the tourist destinations recommended by the deep learning recommendation algorithm.

import os
import openai

# Read the key from the environment instead of hard-coding it in the script
openai.api_key = os.environ["OPENAI_API_KEY"]

def call_chatgpt(prompt):
    response = openai.Completion.create(
        engine="text-davinci-003",
        prompt=prompt,
        max_tokens=50,
        stop=None
    )
    return response.choices[0].text.strip()

user_input = input("Enter the recommended destination you want to know more about: ")
prompt = f"Tell me more about {user_input}"

# Send the destination question first, then continue the conversation
conversation = f"User: {prompt}\n"
response = call_chatgpt(conversation)
conversation += f"ChatGPT: {response}\n"
print("ChatGPT:", response)

while True:
    user_message = input("User: ")
    if user_message.lower() == "exit":
        break
    conversation += f"User: {user_message}\n"
    response = call_chatgpt(conversation)
    conversation += f"ChatGPT: {response}\n"
    print("ChatGPT:", response)

☞ Explanation

This code collects detailed information about the final tourist destination recommended by the deep learning recommendation algorithm and presents it to the tourist.
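Both chat scripts above keep context by appending every turn to one growing string and sending the whole string each time. The sketch below isolates that pattern so it can be run without an API key. The names chat_turn and echo_backend are hypothetical, and echo_backend is only a stand-in for call_chatgpt.

```python
def chat_turn(conversation, user_message, complete):
    """Append one user turn and the model reply; return (conversation, reply)."""
    conversation += f"User: {user_message}\n"
    reply = complete(conversation)
    conversation += f"ChatGPT: {reply}\n"
    return conversation, reply

def echo_backend(prompt):
    """Hypothetical stand-in for call_chatgpt: reports how many lines it was sent."""
    return f"received {len(prompt.splitlines())} line(s)"

conversation = ""
conversation, first = chat_turn(conversation, "Tell me more about a place", echo_backend)
conversation, second = chat_turn(conversation, "And its opening hours?", echo_backend)
```

The second call receives three lines (the first question, the first reply, and the new question), which shows that the prompt grows with every turn. With a real backend, long conversations would therefore need trimming to stay within the model's token limit. Note also that OpenAI has since retired the text-davinci-003 model, so the model name in call_chatgpt would need updating to run the scripts today.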

The End
